CrossCheck full text indexing

To participate in CrossCheck we must deposit a URL that is to be used for CrossCheck indexing (they call these “as-crawled” URLs). Does OJS include these URLs in its automatic DOI export? We are using automatic DOI export. Or do we have to make a separate manual deposit for CrossCheck? Kindly reply.

Hi there,

If you upgrade to OJS 2.4.7, any material that you then submit to Crossref will have the relevant CrossCheck URLs included. You may have to resubmit older content: just re-export your previously submitted content and resubmit the XML.

Cheers,
James

Hello, we have just upgraded our OJS to 2.4.8, but when we export the metadata XML file to resubmit previously submitted content, the XML file does not include the “as-crawled” URLs.

Our XML export contains only the text-mining collection (example below):

<doi_data>
<doi>10.17267/2317-3386bjmhh.v1i1.103</doi>
<resource>https://www5.bahiana.edu.br/index.php/medicine/article/view/103</resource>
<collection property="text-mining">
<item>
<resource mime_type="application/pdf">https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/103/119</resource>
</item>
</collection>
<collection property="text-mining">
<item>
<resource mime_type="application/pdf">https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/103/119</resource>
</item>
</collection>
</doi_data>

Instead, Crossref needs the following tags

<doi_resources>
<doi>10.5555/sampledoi</doi>
<collection property="crawler-based">
<item crawler="iParadigms">
<resource>http://www.yoururl.org/article1_.html</resource>
</item>
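Presumably the complete element Crossref expects would look something like this (closing tags filled in by us; the DOI and URL are Crossref's sample values, not ours):

```xml
<doi_resources>
<doi>10.5555/sampledoi</doi>
<collection property="crawler-based">
<item crawler="iParadigms">
<resource>http://www.yoururl.org/article1_.html</resource>
</item>
</collection>
</doi_resources>
```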

Please, help.

João

Are you concerned with the URLs, or with the property and crawler attributes, or both?

Hello, we need both. :slight_smile:

Thanks,

João de Deus Barreto Segundo
Scientific Communications Analyst
SCU - Scientific Communications Unit
BAHIANA - School of Medicine and Public Health
(55 71) 2101.1916 | (55 71) 993.146.893
http://www.bahiana.edu.br/revistas
http://www.repositorio.bahiana.edu.br

Can you explain in more detail what you are expecting?

For the URLs, are you expecting <resource> URLs which are the default URLs for OJS, or something different?

For the attributes, can you describe what errors you are seeing with Crossref or elsewhere?

Hello, according to Crossref we need the output XML file to include the lines in the support reply quoted below (they were highlighted in yellow in the original message).

Thanks,

João

Shayn Smulyan (Support)

Jun 28, 12:02 EDT

Dear João,

I took a look at your updated deposit. From what I can tell so far, the reason the addition of your as-crawled URL failed is that the opening <item> tags are missing this attribute:
crawler="iParadigms"

So, for example, in the deposit for DOI 10.17267/2317-3386bjmhh.v4i1.754 the <collection> element you deposited was:

<collection property="crawler-based">
<item>
<resource mime_type="application/pdf">
https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/754/558
</resource>
</item>
</collection>

If you add the crawler="iParadigms" attribute, that would look like:

<collection property="crawler-based">
<item crawler="iParadigms">
<resource mime_type="application/pdf">
https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/754/558
</resource>
</item>
</collection>

Please try editing your XML file to include crawler="iParadigms" in all of the as-crawled URL <item> tags, and resubmit the deposit. If you continue to get an error message, please let me know, and include the submission log that’s emailed to you with the relevant error.
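If it helps, a small script along these lines could add the attribute to every crawler-based <item> in one pass instead of editing by hand (this is just a sketch, not an official tool; the function name and file handling are our own):

```python
# One-off fixer for an exported Crossref deposit: add
# crawler="iParadigms" to every <item> inside a <collection>
# whose property is "crawler-based".
import xml.etree.ElementTree as ET

def add_crawler_attribute(xml_text, crawler="iParadigms"):
    root = ET.fromstring(xml_text)
    # Crossref deposits are namespaced, so compare only the local
    # part of each element's tag name.
    for elem in root.iter():
        if elem.tag.split("}")[-1] == "collection" and \
                elem.get("property") == "crawler-based":
            for item in elem:
                if item.tag.split("}")[-1] == "item":
                    item.set("crawler", crawler)
    return ET.tostring(root, encoding="unicode")
```

Run it over the exported file, save the result, and resubmit that.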

Best,
Shayn

If I follow up with Crossref on this, do you have a Ticket number I can reference?

I do not have a ticket number that you can reference, but they may locate our situation in their database through our DOI prefix.

It is 10.17267 (BAHIANA’s).

I have been in touch with them through the email support@crossref.org.

Thanks,

João de Deus Barreto Segundo
Scientific Communications Analyst
SCU - Scientific Communications Unit
BAHIANA - School of Medicine and Public Health

I’ve just confirmed that this attribute is already in OJS as of 2.4.7. Did you use the “Full Package” method for your install? Check your install to verify the accuracy of this file:

I’ll have it checked with our IT department and get back to you asap.

Thank you,

João

Hello again. I’ve just talked to the IT team and the installation followed the proper procedure. We are currently using version 2.4.8, and it does automatically send metadata to the Crossref database with the proper as-crawled URLs, as you said in your previous email.

But Crossref has asked us to resubmit the metadata for our previously published content with the as-crawled URLs included, so that Similarity Check can properly scan the full text of the PDF files we published before the upgrade to version 2.4.8 (which was only implemented this June).

As instructed by Crossref, we have been trying to export and resubmit this data “manually” through the Crossref XML export plugin, because the automatic deposit via the plugin does not seem to include the as-crawled URLs for previously published content. And according to our IT team, in the plugin source code the crawler-based collection is overwritten by the text-mining collection while the XML files are being generated, so the output file contains the text-mining collection twice (instead of one crawler-based and one text-mining collection).
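For reference, using the URLs from our earlier example, we understand the expected export for each article should contain both collections under one <doi_data>, something like this (our own sketch based on Shayn's example, not output we have actually obtained):

```xml
<doi_data>
<doi>10.17267/2317-3386bjmhh.v1i1.103</doi>
<resource>https://www5.bahiana.edu.br/index.php/medicine/article/view/103</resource>
<collection property="crawler-based">
<item crawler="iParadigms">
<resource mime_type="application/pdf">https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/103/119</resource>
</item>
</collection>
<collection property="text-mining">
<item>
<resource mime_type="application/pdf">https://www5.bahiana.edu.br/index.php/medicine/article/viewFile/103/119</resource>
</item>
</collection>
</doi_data>
```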

Thanks,

João