Way to clean or manually edit an OHS database

IdeaFix · November 20, 2015, 6:23pm

Hi. I have a test installation of OHS with some sources (DSpace, OJS, EPrints) on it.

The first time, I tried to use the MODS schema with DSpace, but it did not work very well, so I decided to use Dublin Core instead. I stopped an update process (by closing the tab), pressed the “flush metadata” button, and deleted this source.

Now, when I add it again, the harvest is never complete: I get 2777 items each time (the whole archive has about 6000 items).

I have tried running “dspace oai clean-cache” on the source side, restarting the LAMP stack on the OHS side, and deleting the row in the “archives” table of the OHS database.

Is there any way to really flush all of the “old” metadata?

asmecher · November 24, 2015, 6:54pm

Hi @IdeaFix,

The “flush metadata” button is the best way to do it. If you’re not getting all records harvested, there are a few common causes…

If several repositories use identical record identifiers, there will be collisions. Make sure each repository uses unique IDs for its records. (This is normally done by setting the archive’s repository ID.)
If you’re using the web-based harvest tool, you might want to try the command-line one instead. Depending on your server, there may be time limits set for web-based requests, so your harvesting might get interrupted.
If your OAI data source encounters an error or generates invalid XML, that might cause the harvester to stop.
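
As a rough illustration of the first point, here is a minimal Python sketch (not part of OHS) that compares the record identifiers from two OAI-PMH ListIdentifiers responses and reports any that appear in both. The function names, the archive labels, and the sample identifiers such as oai:localhost:123456789/2 are made up for the example; in practice you would feed in the XML actually returned by each repository.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def extract_identifiers(list_identifiers_xml):
    """Return the record identifiers found in one ListIdentifiers response."""
    root = ET.fromstring(list_identifiers_xml)
    return [
        el.text.strip()
        for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
        if el.text
    ]


def find_collisions(ids_by_archive):
    """Map each identifier used by more than one archive to those archives."""
    seen = {}
    for archive, identifiers in ids_by_archive.items():
        for identifier in set(identifiers):
            seen.setdefault(identifier, []).append(archive)
    return {i: sorted(a) for i, a in seen.items() if len(a) > 1}


# Invented sample responses: both repositories kept a default-looking
# "localhost" prefix, so one identifier is shared between them.
SAMPLE_A = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:localhost:123456789/1</identifier></header>
    <header><identifier>oai:localhost:123456789/2</identifier></header>
  </ListIdentifiers>
</OAI-PMH>"""

SAMPLE_B = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:localhost:123456789/2</identifier></header>
    <header><identifier>oai:localhost:123456789/3</identifier></header>
  </ListIdentifiers>
</OAI-PMH>"""

if __name__ == "__main__":
    collisions = find_collisions({
        "archive_a": extract_identifiers(SAMPLE_A),
        "archive_b": extract_identifiers(SAMPLE_B),
    })
    for identifier, archives in collisions.items():
        print(identifier, "is used by", ", ".join(archives))
```

If this kind of check reports shared identifiers, that would be consistent with the collision scenario described above; an empty result suggests looking at the other two causes instead.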

Regards,
Alec Smecher
Public Knowledge Project Team

IdeaFix · November 25, 2015, 4:44pm

Hi.

I use the web-based tool, and my repositories have addresses like these:

The string /handle/123456789/XXX may be the same for two repositories.

Please help me understand what the “Public ID” parameter in the archive properties means. I have read the description, “This unique identifier can be used in URL-based searches to identify this archive.”, but what value should I enter there? For example, for this repository: http://elar.rsvpu.ru/oai/request?verb=Identify or for this one: http://elar.usfeu.ru/oai/request?verb=Identify

Thank you