#duraspace IRC Log

IRC Log for 2017-10-23

Timestamps are in GMT/BST.

[6:37] * DuraLogBot (~PircBot@webster.duraspace.org) has joined #duraspace
[6:37] * Topic is 'Welcome to DuraSpace IRC. This channel is used for formal meetings and is logged - http://irclogs.duraspace.org/'
[6:37] * Set by tdonohue on Thu Sep 15 17:49:38 UTC 2016
[12:14] * mhwood (~mhwood@mhw.ulib.iupui.edu) has joined #duraspace
[12:56] * tdonohue (~tdonohue@dspace/tdonohue) has joined #duraspace
[14:16] * mhwood (~mhwood@mhw.ulib.iupui.edu) Quit (Ping timeout: 248 seconds)
[14:29] * mhwood (~mhwood@mhw.ulib.iupui.edu) has joined #duraspace
[14:47] * misilot (~misilot@p-body.lib.fit.edu) Quit (Quit: Leaving)
[14:57] * misilot (~misilot@p-body.lib.fit.edu) has joined #duraspace
[15:40] <DSpaceSlackBot> <mwood> Sorry, I never was able to get a connection on the bus last week.
[15:43] <DSpaceSlackBot> <mwood> I think that I see discussion of how to index an embargoed item after the embargo expires. Why would we not index all items immediately, and then check resource policies when responding to requests? That is: just because some object is known in some index does not mean that a given session is permitted to see it or even to know of its existence.
[15:46] <DSpaceSlackBot> <mwood> Then we don't need an embargo-expired event (which would be difficult to provide). We just need to have the OAI indexing code plugged into the event bus, receiving the same events as other bits, indexing objects as they are installed.
[15:47] <DSpaceSlackBot> <tdonohue> @mwood: I think that's what we essentially settled on, as it's also how Discovery works (everything is indexed, even if it is access restricted...and if you are logged in with access, additional items may appear)
[15:47] <DSpaceSlackBot> <tdonohue> But, that said, this doesn't yet exist
[15:48] <DSpaceSlackBot> <tdonohue> (for embargoed items, that is)
[15:48] <DSpaceSlackBot> <mwood> Sounds good, thanks.
[15:49] <DSpaceSlackBot> <tdonohue> There is still an outstanding question, though, of how to create an OAI index event, as the OAI code doesn't exist in dspace-api and all event notification seems to happen there
[15:52] <DSpaceSlackBot> <mwood> So we need some sort of OAI-specific operations when creating an index record? Because writing a record into an index shouldn't be very special.
[15:53] <DSpaceSlackBot> <tdonohue> No, it's more that there are currently two separate Solr indexes: one for Discovery, and one for OAI. The OAI one is maintained separately (via a cron job), and Discovery is automated (via events). The OAI one cannot currently be automated, as event notification seems to require at least some of the code to be in dspace-api
[15:53] <DSpaceSlackBot> <mwood> Well, I should go read the code and find out why we have a special index for OAI in the first place, before saying any more.
[15:54] <DSpaceSlackBot> <tdonohue> But, yes, one option might be to merge these into one index (not sure if that's totally plausible, given the number of export formats OAI-PMH needs). The other is to find a way to simply automate both (may require moving some OAI code into dspace-api, but that's TBD exactly)
[15:56] <DSpaceSlackBot> <mwood> I have a meeting, but will be back in an hour.
[16:03] <DSpaceSlackBot> <sulfrian> I think the OAI webapp does not access the database backend. So all metadata from the items has to be in the Solr index.
[17:08] <DSpaceSlackBot> <mwood> Perhaps it should ask the database. The webapp has two tasks: (1) identify the objects which correspond to the query, which gains efficiency from an index; (2) compose a response, which needs full metadata access but for which an index is irrelevant. After (1), we have the UUIDs for the relevant objects, so fetching (2) directly from the database would be reasonably efficient. But that's a significant chan
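
The two-step split described above could look something like the sketch below: the index is consulted only to find matching UUIDs, and full metadata is then read from the database. Everything here is an assumption for illustration: the index is a plain dict, the database is an in-memory SQLite table named `metadata`, and the function names are hypothetical. It does not reflect the actual DSpace schema or OAI code.

```python
import sqlite3


def make_demo_connection():
    """Build a small in-memory stand-in for the metadata store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metadata (item_uuid TEXT, field TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [
            ("uuid-1", "dc.title", "First thesis"),
            ("uuid-1", "dc.creator", "A. Author"),
            ("uuid-2", "dc.title", "Second thesis"),
            ("uuid-3", "dc.title", "Unrelated report"),
        ],
    )
    return conn


def find_matching_uuids(index, set_spec):
    """Step (1): use the index only to identify which objects match the query."""
    return index.get(set_spec, [])


def fetch_metadata(conn, item_uuids):
    """Step (2): load full metadata for those UUIDs directly from the database."""
    if not item_uuids:
        return {}
    placeholders = ",".join("?" for _ in item_uuids)
    rows = conn.execute(
        "SELECT item_uuid, field, value FROM metadata "
        f"WHERE item_uuid IN ({placeholders}) ORDER BY item_uuid, field",
        list(item_uuids),
    )
    records = {}
    for item_uuid, field, value in rows:
        records.setdefault(item_uuid, {}).setdefault(field, []).append(value)
    return records
```

Under this split the index only has to hold enough to answer "which objects match", so it would not need a copy of every item's full metadata.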
[18:21] <DSpaceSlackBot> <mwood> Having read the PMH spec., I now wonder why we use Solr at all in the OAI code. PMH will do one of three things: (1) retrieve one specific record by ID; (2) retrieve all records; (3) retrieve all records filtered by date range and/or set identifier. None of those makes me think, "I need full-text indexing to do that." (1) and (2) are obvious, and in (3) the set identifier is an opaque label for a fixed quer
[18:21] <DSpaceSlackBot> side.
[18:24] <DSpaceSlackBot> <tom_desair> I’ve also asked myself that same question.
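
To make the point about the three PMH operations concrete, here is a rough sketch of how they might be served from an ordinary relational table with no full-text index. The table layout, the function names, and the choice of SQLite are all assumptions for illustration. In particular, storing a single `set_spec` per record is a simplification, since a real repository may place an item in several sets. Datestamps are kept as ISO 8601 strings in one consistent format, so plain string comparison orders them correctly.

```python
import sqlite3


def open_store():
    """Create a hypothetical record table; an ordinary B-tree index covers the filters."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        "identifier TEXT PRIMARY KEY, datestamp TEXT NOT NULL, set_spec TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX record_by_set_and_date ON record (set_spec, datestamp)")
    return conn


def get_record(conn, identifier):
    """Case (1): retrieve one specific record by ID."""
    return conn.execute(
        "SELECT identifier, datestamp, set_spec FROM record WHERE identifier = ?",
        (identifier,),
    ).fetchone()


def list_records(conn, from_date=None, until_date=None, set_spec=None):
    """Cases (2) and (3): all records, optionally filtered by date range and/or set."""
    clauses, params = [], []
    if from_date is not None:
        clauses.append("datestamp >= ?")
        params.append(from_date)
    if until_date is not None:
        clauses.append("datestamp <= ?")
        params.append(until_date)
    if set_spec is not None:
        clauses.append("set_spec = ?")
        params.append(set_spec)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return conn.execute(
        "SELECT identifier, datestamp, set_spec FROM record"
        + where
        + " ORDER BY datestamp, identifier",
        params,
    ).fetchall()
```

Each request reduces to a primary-key lookup or a range scan, which is the kind of query a relational database already handles well.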
[21:07] * mhwood (~mhwood@mhw.ulib.iupui.edu) Quit (Remote host closed the connection)
[21:41] * tdonohue (~tdonohue@dspace/tdonohue) has left #duraspace

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.