Way to clean or manually edit an OHS database

Hi. I have a test installation of OHS with several sources (DSpace, OJS, EPrints) configured in it.

The first time, I tried to use the MODS schema with DSpace, but the results were not very good, so I decided to use Dublin Core instead. I stopped the update process (by closing the tab), pressed the “flush metadata” button, and deleted this source.

Now, when I add it again, the harvest is never complete: I get 2777 items each time (the whole archive is about 6000 items).

I tried running “dspace oai clean-cache” on the source side, restarting LAMP on the OHS side, and deleting the row in the “archives” table of the OHS database.

Is there any way to really flush all of the “old” metadata?

Hi @IdeaFix,

The “flush metadata” button is the best way to do it. If you’re not getting all records harvested, there are a few common causes…

  • If several repositories use identical record identifiers, there will be collisions. Make sure each repository uses unique IDs for its records. (This is normally done by setting the archive’s repository ID.)
  • If you’re using the web-based harvest tool, you might want to try the command-line one instead. Depending on your server, there may be time limits set for web-based requests, so your harvesting might get interrupted.
  • If your OAI data source encounters an error or generates invalid XML, that might cause the harvester to stop.
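One way to narrow down these causes is to walk the source's OAI-PMH `ListIdentifiers` responses outside the harvester and see how far they get. The following is a minimal sketch, not part of OHS: it assumes the source speaks standard OAI-PMH 2.0 and offers the `oai_dc` metadata prefix, and the function names `parse_page` and `count_identifiers` are made up for this example. It counts unique and duplicate identifiers and fails loudly on invalid XML or an OAI error response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_page(xml_text):
    """Return (identifiers, resumption_token) for one ListIdentifiers page.

    Raises ET.ParseError if the response is not well-formed XML and
    RuntimeError if the source returned an OAI <error> element.
    """
    root = ET.fromstring(xml_text)
    error = root.find("oai:error", OAI_NS)
    if error is not None:
        raise RuntimeError(f"OAI error {error.get('code')}: {error.text}")
    ids = [e.text for e in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing resumptionToken means this was the last page.
    next_token = (token.text or "").strip() if token is not None else ""
    return ids, next_token or None


def count_identifiers(base_url, metadata_prefix="oai_dc"):
    """Walk every page and return (unique_count, duplicate_count).

    This performs real HTTP requests against base_url.
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    seen, duplicates = set(), 0
    while True:
        with urlopen(base_url + "?" + urlencode(params)) as resp:
            ids, token = parse_page(resp.read())
        for identifier in ids:
            if identifier in seen:
                duplicates += 1
            seen.add(identifier)
        if token is None:
            return len(seen), duplicates
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
```

Calling, for example, `count_identifiers("http://elar.rsvpu.ru/oai/request")` and comparing the result with the expected archive size may show whether the source itself stops early. An exception partway through points to a bad response from the source rather than a problem on the OHS side.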

Regards,
Alec Smecher
Public Knowledge Project Team

Hi.

I use the web-based tool, and my repositories have addresses like these:

  1. http://hostname1/handle/123456789/XXX
  2. http://hostname2/handle/123456789/XXX

The string /handle/123456789/XXX may be the same for two repositories.
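To make the concern concrete, here is a toy sketch of how such a collision could arise. It does not reflect how OHS actually stores records: the `store` function and the sample handle `123456789/42` (standing in for `XXX`) are hypothetical, and the only point is that records keyed by the bare handle overwrite each other, while records keyed by repository plus handle do not.

```python
def store(records, namespaced):
    """Index (repo_id, identifier) pairs in a dict and return it.

    With namespaced=False the key is the bare identifier, so the same
    handle from two repositories lands on the same key.
    """
    index = {}
    for repo_id, identifier in records:
        key = f"{repo_id}:{identifier}" if namespaced else identifier
        index[key] = repo_id  # a later record silently replaces an earlier one
    return index


# Hypothetical sample data: the same handle on two different hosts.
records = [
    ("hostname1", "123456789/42"),
    ("hostname2", "123456789/42"),
]

bare = store(records, namespaced=False)
prefixed = store(records, namespaced=True)
print(len(bare), len(prefixed))
```

In this toy model, only one of the two records survives when the key is the bare handle, and both survive once the key includes a per-repository prefix.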

Please help me understand what the “Public ID” parameter in the archive properties means. The description says “This unique identifier can be used in URL-based searches to identify this archive.”, but what should I put in this parameter for, say, this repository http://elar.rsvpu.ru/oai/request?verb=Identify or this one http://elar.usfeu.ru/oai/request?verb=Identify

Thank you