OAI-PMH and "Access to Journal Content" in OJS 3.0

Hi,

I tested turning on “OJS will not be used to publish the journal’s contents online.” and noticed that the article metadata is still visible in OAI-PMH.

Is this expected behaviour?

Hi @ajnyga,

Yes, that’s intended behavior – the OAI interface may still be of use even for journals that aren’t publishing via the front-end. You can disable the OAI interface in config.inc.php.
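
For reference, that switch lives in the [oai] section of config.inc.php. A minimal sketch, assuming the section and key names from the stock OJS config template; it parses an inline copy of the fragment rather than a real installation's file, and the `isOaiEnabled` helper is hypothetical, not part of OJS:

```php
<?php
// Inline copy of the relevant config.inc.php fragment (an INI file despite the .php name).
$configFragment = <<<'INI'
[oai]
; Enable OAI front-end to the site
oai = Off
INI;

// Hypothetical helper, not part of OJS: reports whether the fragment enables OAI.
function isOaiEnabled(string $ini): bool {
	$config = parse_ini_string($ini, true);
	// In its default scanner mode parse_ini_string maps On/Off to "1"/"".
	return !empty($config['oai']['oai']);
}

var_dump(isOaiEnabled($configFragment));
```

Note that the setting applies to the whole installation, not to a single journal.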

Regards,
Alec Smecher
Public Knowledge Project Team

Hi,

I know that I can disable OAI from config.inc.php, but as far as I know that would disable it for all journals on our site?

If a journal is using OJS only for the editorial process (and is publishing elsewhere), why would they want to publish their metadata through OAI-PMH when it also includes URLs leading to articles in OJS? I mean, what is the scenario where a journal is not publishing in OJS but still needs OAI-PMH?

Edit: what I mean is, how does the harvesting side know that the articles are not actually available at the URLs that OAI-PMH is providing? Of course, part of the metadata could still be useful.

Hi @ajnyga,

It would probably be a good idea to remove the article URLs etc. from the OAI interface for sites that aren’t using the publishing front-end, agreed. I’ve filed this at Don't expose front-end URLs via OAI when publishing disabled · Issue #2050 · pkp/pkp-lib · GitHub.
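
Purely as an illustration of the idea, and not the actual change made for that issue: a sketch that drops identifiers pointing at the OJS front-end when the journal's publishing mode is PUBLISHING_MODE_NONE and keeps everything else (a DOI, for example). The function name, the base URL and the sample identifiers are all hypothetical.

```php
<?php
// Same value OJS uses for PUBLISHING_MODE_NONE; defined here so the sketch runs standalone.
defined('PUBLISHING_MODE_NONE') || define('PUBLISHING_MODE_NONE', 2);

// Hypothetical helper, not an OJS function.
function identifiersForOai(array $identifiers, int $publishingMode, string $baseUrl): array {
	if ($publishingMode !== PUBLISHING_MODE_NONE) {
		return $identifiers; // front-end is in use, expose everything
	}
	// Keep only identifiers that do not start with the front-end base URL.
	return array_values(array_filter($identifiers, function ($id) use ($baseUrl) {
		return strpos($id, $baseUrl) !== 0;
	}));
}

$ids = [
	'https://ojs.example.org/index.php/demo/article/view/12', // hypothetical front-end URL
	'10.1234/demo.12',                                        // hypothetical DOI
];
print_r(identifiersForOai($ids, PUBLISHING_MODE_NONE, 'https://ojs.example.org/'));
```

The rest of the record (title, authors, and so on) would still be harvestable in this variant; only the links that lead nowhere are removed.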

Regards,
Alec Smecher
Public Knowledge Project Team

Thanks again @asmecher!

I was just checking through the OAI-PMH docs to see if they have guidelines for these kinds of situations, but it’s already 10 PM in Finland so I will probably continue tomorrow.

Hi @ajnyga,

Guidelines would be welcome if you can find them! It may also be possible to add a new mode to OJS to cause it to only provide information via OAI for journals that have the publishing front-end enabled, but that seems like a niche feature and as a result would probably not receive good maintenance. We’re trying to avoid that situation in OJS 3.x.

Regards,
Alec Smecher
Public Knowledge Project Team

As a native speaker you probably read this better, but my understanding is that a record has to have a URI of some sort (dc:identifier) and the record has to be available through that URI:

To facilitate access to the resource associated with harvested metadata, repositories should use an element in metadata records to establish a linkage between the record (and the identifier of its item) and the identifier (URL, URN, DOI, etc.) of the associated resource. The mandatory Dublin Core format provides the identifier element that should be used for this purpose.

http://www.openarchives.org/OAI/openarchivesprotocol.html

Hi @ajnyga,

That reads to me as the URL being optional. The reference to “mandatory” is related to the DC format being a requirement of any OAI implementation.

Regards,
Alec Smecher
Public Knowledge Project Team

Hi @asmecher

We have run into trouble with OJS OAI-PMH listing articles for journals that use OJS only for the editorial process and do not publish articles in it.

I noticed that for journals that are hidden (not enabled) OAI does not list the records. It is probably handled here https://github.com/pkp/ojs/blob/master/classes/oai/ojs/OAIDAO.inc.php#L241
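
To check what a harvester actually sees for a given journal, a ListRecords request can be built like this. A minimal sketch: the host and journal path are hypothetical, the helper is not part of OJS, and the `/index.php/<journal>/oai` endpoint layout is an assumption that depends on your URL settings.

```php
<?php
// Hypothetical helper: builds an OAI-PMH ListRecords URL for one journal.
function listRecordsUrl(string $baseUrl, string $journalPath): string {
	// verb and metadataPrefix are the two arguments ListRecords requires.
	$query = http_build_query([
		'verb' => 'ListRecords',
		'metadataPrefix' => 'oai_dc',
	], '', '&');
	return rtrim($baseUrl, '/') . '/index.php/' . rawurlencode($journalPath) . '/oai?' . $query;
}

echo listRecordsUrl('https://ojs.example.org', 'demo'), "\n";
```

Opening the resulting URL in a browser shows whether records are listed; when nothing matches, the protocol's `noRecordsMatch` error would be the expected answer.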

I realize that there could be cases where records without URLs could be useful. However, I am suggesting a quick fix for OJS 3.1.1 where we remove the journals that do not use OJS for publishing from OAI altogether. This would probably be an easy fix (an addition to the SQL above) and would make a lot more sense than the current situation.

Maybe just something like this:

	function _getRecordsRecordSet($setIds, $from, $until, $set, $submissionId = null, $orderBy = 'journal_id, submission_id') {
		$journalId = array_shift($setIds);
		$sectionId = array_shift($setIds);

		$params = array();
		// First placeholder: the setting name matched in the journal_settings JOIN below.
		$params[] = 'publishingMode';
		if (isset($journalId)) $params[] = (int) $journalId;
		if (isset($sectionId)) $params[] = (int) $sectionId;
		if ($submissionId) $params[] = (int) $submissionId;
		if (isset($journalId)) $params[] = (int) $journalId;
		if (isset($sectionId)) $params[] = (int) $sectionId;
		if (isset($set)) {
			$params[] = $set;
			$params[] = $set . ':%';
		}
		if ($submissionId) $params[] = (int) $submissionId;
		$result = $this->retrieve(
			'SELECT	LEAST(a.last_modified, i.last_modified) AS last_modified,
				a.submission_id AS submission_id,
				j.journal_id AS journal_id,
				s.section_id AS section_id,
				i.issue_id,
				NULL AS tombstone_id,
				NULL AS set_spec,
				NULL AS oai_identifier
			FROM
				published_submissions pa
				JOIN submissions a ON (a.submission_id = pa.submission_id)
				JOIN issues i ON (i.issue_id = pa.issue_id)
				JOIN sections s ON (s.section_id = a.section_id)
				JOIN journals j ON (j.journal_id = a.context_id)
				JOIN journal_settings jsl ON (jsl.journal_id = j.journal_id AND jsl.setting_name=?)
			WHERE	i.published = 1 AND j.enabled = 1 AND jsl.setting_value != 2 AND a.status <> ' . STATUS_DECLINED . '
				' . (isset($journalId) ?' AND j.journal_id = ?':'') . '
				' . (isset($sectionId) ?' AND s.section_id = ?':'') . '
				' . ($from?' AND GREATEST(a.last_modified, i.last_modified) >= ' . $this->datetimeToDB($from):'') . '
				' . ($until?' AND LEAST(a.last_modified, i.last_modified) <= ' . $this->datetimeToDB($until):'') . '
				' . ($submissionId?' AND a.submission_id = ?':'') . '
			UNION
			SELECT	dot.date_deleted AS last_modified,
				dot.data_object_id AS submission_id,
				' . (isset($journalId) ? 'tsoj.assoc_id' : 'NULL') . ' AS assoc_id,' . '
				' . (isset($sectionId)? 'tsos.assoc_id' : 'NULL') . ' AS section_id,
				NULL AS issue_id,
				dot.tombstone_id,
				dot.set_spec,
				dot.oai_identifier
			FROM	data_object_tombstones dot' . '
				' . (isset($journalId) ? 'JOIN data_object_tombstone_oai_set_objects tsoj ON (tsoj.tombstone_id = dot.tombstone_id AND tsoj.assoc_type = ' . ASSOC_TYPE_JOURNAL . ' AND tsoj.assoc_id = ?)' : '') . '
				' . (isset($sectionId)? 'JOIN data_object_tombstone_oai_set_objects tsos ON (tsos.tombstone_id = dot.tombstone_id AND tsos.assoc_type = ' . ASSOC_TYPE_SECTION . ' AND tsos.assoc_id = ?)' : '') . '
			WHERE	1=1
				' . (isset($set)?' AND (dot.set_spec = ? OR dot.set_spec LIKE ?)':'') . '
				' . ($from?' AND dot.date_deleted >= ' . $this->datetimeToDB($from):'') . '
				' . ($until?' AND dot.date_deleted <= ' . $this->datetimeToDB($until):'') . '
				' . ($submissionId?' AND dot.data_object_id = ?':'') . '
			ORDER BY ' . $orderBy,
			$params
		);
		return $result;
	}
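
The effect of the added JOIN and the `jsl.setting_value != 2` condition can be sketched in plain PHP without a database. The journal rows below are hypothetical; the `unset` row models a journal with no publishingMode row in journal_settings, which the inner JOIN would also leave out (whether such journals exist in practice is an assumption worth checking).

```php
<?php
// Same value OJS uses for PUBLISHING_MODE_NONE; defined here so the sketch runs standalone.
defined('PUBLISHING_MODE_NONE') || define('PUBLISHING_MODE_NONE', 2);

// Hypothetical stand-ins for journals joined with their publishingMode setting.
$journals = [
	['path' => 'open',      'enabled' => 1, 'publishingMode' => 0],
	['path' => 'editorial', 'enabled' => 1, 'publishingMode' => 2],
	['path' => 'hidden',    'enabled' => 0, 'publishingMode' => 0],
	['path' => 'unset',     'enabled' => 1, 'publishingMode' => null],
];

// Mirrors: JOIN journal_settings ... WHERE j.enabled = 1 AND jsl.setting_value != 2
function isExposedViaOai(array $journal): bool {
	// Inner JOIN: without a setting row there is no result row at all.
	if ($journal['publishingMode'] === null) {
		return false;
	}
	return $journal['enabled'] == 1
		&& (int) $journal['publishingMode'] !== PUBLISHING_MODE_NONE;
}

$exposed = array_column(array_filter($journals, 'isExposedViaOai'), 'path');
print_r($exposed);
```

Only the `open` journal passes: `editorial` is filtered by the new condition, `hidden` by the existing `j.enabled = 1` check.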

Added this as an issue on GitHub and made a PR: [OJS] OAI-PMH should not show results for journals with publishingMode set to PUBLISHING_MODE_NONE · Issue #3026 · pkp/pkp-lib · GitHub

Hi @asmecher. A librarian who manages a multi-journal OJS instance is asking us whether it is still possible to access the article records of a specific journal when, for this journal, they use the OJS instance only to manage the editorial workflow, not for publishing (in Settings > Distribution > Access, they checked “OJS will not be used to publish the journal’s contents online”).

Thanks!

Hi @pigeonm,

There’s a discussion of that issue on GitHub; I’ll continue it there.

Thanks,
Alec Smecher
Public Knowledge Project Team