#duraspace IRC Log


IRC Log for 2012-05-16

Timestamps are in GMT/BST.

[5:44] * hamslaai1 (~hilton@196-215-9-149.dynamic.isadsl.co.za) has joined #duraspace
[5:51] * hamslaai1 (~hilton@196-215-9-149.dynamic.isadsl.co.za) has left #duraspace
[6:49] -sendak.freenode.net- *** Looking up your hostname...
[6:49] -sendak.freenode.net- *** Checking Ident
[6:49] -sendak.freenode.net- *** Found your hostname
[6:49] -sendak.freenode.net- *** No Ident response
[6:49] * DuraLogBot (~PircBot@atlas.duraspace.org) has joined #duraspace
[6:49] * Topic is '[Welcome to DuraSpace - This channel is logged - http://irclogs.duraspace.org/]'
[6:49] * Set by cwilper!ad579d86@gateway/web/freenode/ip. on Fri Oct 22 01:19:41 UTC 2010
[11:59] * mhwood (mwood@mhw.ulib.iupui.edu) has joined #duraspace
[12:59] * KevinVdV (~kevin@ has joined #duraspace
[12:59] * KevinVdV (~kevin@ Quit (Client Quit)
[13:08] * tdonohue (~tdonohue@c-67-177-108-221.hsd1.il.comcast.net) has joined #duraspace
[13:54] * scottatm (~scottatm@adhcp218.evans.tamu.edu) Quit (Quit: scottatm)
[14:00] * scottatm (~scottatm@adhcp218.evans.tamu.edu) has joined #duraspace
[17:01] <tdonohue> Hi All, my first "DSpace Developers Office Hours" starts now: https://wiki.duraspace.org/display/~tdonohue/DSpace+Office+Hours If you have something you'd like to chat about, let me know!
[17:01] <kompewter> [ DSpace Office Hours - Tim Donohue - DuraSpace Wiki ] - https://wiki.duraspace.org/display/~tdonohue/DSpace+Office+Hours
[17:01] <tdonohue> (In the meantime, I'll just be lurking here & multi-tasking)
[18:11] * ryscher (98033b9f@gateway/web/freenode/ip. has joined #duraspace
[18:52] <ryscher> tdonohue: Office hours seem quiet so far....I have two quick questions related to licensing.
[18:53] <tdonohue> sure, go ahead, ryscher. I'm around, just multitasking :)
[18:55] <ryscher> 1.) Currently, www.dspace.org/license is a 3-clause BSD license. I thought DSpace previously had the 2-clause version. Am I wrong? If not, was this a conscious change?
[18:57] <tdonohue> ryscher: RE: #1, as far as I'm aware it's always been a 3-clause BSD. I just checked the tagged DSpace 1.0 release, and it looks to be the same license there (with MIT & HP mentioned instead of DuraSpace & DSpace): https://svn.duraspace.org/dspace/dspace/tags/dspace-1_0/dspace/LICENSE
[18:58] <ryscher> tdonohue: ok, good. My mistake. I'll update the default Dryad licensing to minimize differences.
[18:59] <ryscher> 2.) The majority of our Dryad code is based on DSpace code, copyright DuraSpace. But for new code we create from scratch, we have been keeping copyright ourselves, and only assigning copyright to DuraSpace for things that move into the main DSpace distribution.
[19:00] <ryscher> We've been discussing whether it is better/easier to just assign copyright to DuraSpace from the outset, even for code that might never make it into the main DSpace. Does DuraSpace have thoughts/objections?
[19:03] <tdonohue> Your #2 sounds like the normal workflow that I've heard many developers use. As to whether DuraSpace would have any objections to assigning immediate copyright, I might want to run it by a few other DuraSpace folks. It sounds OK. But, the one oddity is that we don't actually have the "code" (i.e. it's still your code) and aren't helping maintain it.
[19:03] <tdonohue> A question for you -- is there a reason you can think of why it's "better/easier" to assign immediate copyright to DuraSpace?
[19:04] <ryscher> Yes, it's still our code, but the lines get fuzzier as we move into GitHub, especially if we're doing everything in a fork of DSpace.
[19:04] <tdonohue> The only situation I can think of is that it's code that you *want* to give back to DSpace. In which case, I think it'd be perfectly fine to assign copyright to DuraSpace right away.
[19:05] <tdonohue> ryscher : yea, the lines do get "fuzzier" in GitHub. But, technically if it's code that is only in your fork, then it's not yet "owned" by DuraSpace. It only becomes "owned" by DuraSpace once it is merged back into the main DSpace GitHub: https://github.com/DSpace/DSpace
[19:06] <kompewter> [ DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace
[19:06] <ryscher> So far, our mode has been to create things that meet our needs, and send things to DSpace as the opportunity arises. We don't always know what pieces will be contributed.
[19:07] <mhwood> That's what I do. Code that we intend to contribute to DSpace gets the DuraSpace license block; code we intend to hoard gets a "Copyright nnnn Indiana University" block.
[19:08] <mhwood> Oops, "that" being what tdonohue said.
[19:09] * hpottinger (~hpottinge@mu-162198.dhcp.missouri.edu) has joined #duraspace
[19:09] <ryscher> My thoughts about it being better/easier to assign immediately center around reducing the need to worry about which code gets which copyright. But in practice, it only takes a few seconds to change it
[19:09] <tdonohue> One "hair" to split -- technically, DuraSpace can only provide "copyright protection" on stuff in the main GitHub. I.e. if a single developer writes some questionable/illegal code in their fork & gets sued over it (highly unlikely to ever happen), DuraSpace is not liable.
[19:10] <ryscher> Luckily, most of us are at universities with fancy lawyers :)
[19:10] <tdonohue> That previous statement is an "extreme" scenario -- I doubt it will ever happen. But, that's part of the reason why we don't really assign copyright until DuraSpace takes control :)
[19:11] <tdonohue> yea, that is true, ryscher!
[19:12] * hpottinger observes this is a logged channel... if you have a good lawyer joke, best hold your fire :-)
[19:12] <tdonohue> All in all, I don't think it's a big deal either way (i.e. DuraSpace isn't going to be searching GitHub for code that states copyright wrongly)
[19:13] <mhwood> Heehee, I have a whole page-a-day calendar of lawyer jokes here. (But I guess that the publisher would be unamused if I shared any.)
[19:14] <tdonohue> So, does that help any ryscher? I think the normal practice is what mhwood said: code to contribute can be marked DuraSpace right away (or as soon as you know you will contribute). Code to "hoard" can go under your own copyright. But, it's not really a "strict" rule.
[19:15] <tdonohue> & if you hope to contribute all/most of it, then feel free to start putting that DuraSpace header on whatever you want (I know that Dryad is doing a lot of cool stuff that would be great to get back into DSpace someday)
[19:16] <ryscher> tdonohue: yes, that answers my questions. I'm tying up some loose ends in contractual language, so I wanted to be clear about the range of possibilities.
[19:18] <ryscher> we'll definitely be contributing more content in the future... hopefully keeping our code "hoard" to a minimum.
[19:18] <tdonohue> great to hear :)
[19:29] * KevinVdV (~kevin@d54C154B1.access.telenet.be) has joined #duraspace
[19:42] <hpottinger> quiet office hours so far, tdonohue?
[19:44] <tdonohue> hpottinger: yep. has been quiet. But, not a big deal (after all, it was just announced yesterday). "Office Hours" is just meant to let you all know that this is a time period when you are more than welcome to ping me to chat about whatever (technically you can do that whenever I'm in IRC, but I may not always be as responsive depending on what I'm doing)
[19:45] <tdonohue> currently, I'm just doing some other work (multitasking), since no one has anything to chat about today (and that's fine)
[19:46] <tdonohue> but, it was good that ryscher popped in to ask a few questions :)
[19:48] <hpottinger> well, to continue a discussion we were having the other day, the patch I came up with for DS-1174 does not appear to put a stack trace in any log file, alas
[19:48] <kompewter> [ https://jira.duraspace.org/browse/DS-1174 ] - [#DS-1174] exception handling for itemUpdate hides the stack trace, which obstructs troubleshooting efforts - DuraSpace JIRA
[19:49] <tdonohue> Brief reminder to all that our DSpace Developers Meeting starts in 12 mins. https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-05-16
[19:49] <kompewter> [ DevMtg 2012-05-16 - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-05-16
[19:49] <tdonohue> hpottinger: shouldn't it just return the stack trace to the commandline?
[19:50] <tdonohue> i.e. I thought itemUpdate was a commandline tool? Or am I forgetting how it is used in the UI?
[19:50] <hpottinger> tdonohue, that would make sense, but it doesn't do that, either
[19:50] <tdonohue> hmm..that is odd.
[19:50] <hpottinger> Exception processing item 0001: java.lang.NullPointerException
[19:51] <tdonohue> (trying to find the source code again)
[19:52] <hpottinger> I can post an actual patch for review
[19:52] <tdonohue> yea, that might help to attach it to Ds-1174
[19:54] <hpottinger> dspace-api/src/main/java/org/dspace/app/itemupdate/ItemUpdate.java is the file
[19:55] <tdonohue> or a Github link, if your fix is up there. Here's the line, if I recall correctly: https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/app/itemupdate/ItemUpdate.java#L456
[19:55] <kompewter> [ DSpace/dspace-api/src/main/java/org/dspace/app/itemupdate/ItemUpdate.java at master · DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/app/itemupdate/ItemUpdate.java#L456
[19:56] <hpottinger> correct, my patch, I've added a line 457
[19:56] <hpottinger> e.printStackTrace();
[19:58] <tdonohue> yep. I thought that should work. but, maybe there's something else going on there that I didn't notice.
[19:58] <tdonohue> unfortunately, we have our mtg about to start here in a few minutes. So, we may have to look at this afterwards
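For readers following the DS-1174 thread, here is a minimal sketch of the difference between printing an exception (which yields a one-line message like the one hpottinger pasted) and printing its stack trace. The class and method names are hypothetical and this is not the ItemUpdate code; it only illustrates the general Java behaviour, and it does not explain why the patched line produced no output.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceDemo {
    /** Returns the full stack trace of t as a String, in the form printStackTrace() emits. */
    static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            t.printStackTrace(out);
        }
        return buffer.toString();
    }

    /** Hypothetical stand-in for per-item processing that fails. */
    static void processItem(String itemId) {
        String missing = null;
        missing.length(); // deliberately triggers a NullPointerException
    }

    public static void main(String[] args) {
        try {
            processItem("0001");
        } catch (Exception e) {
            // Message only: names the exception class but not where it was thrown.
            System.err.println("Exception processing item 0001: " + e);
            // Full trace: includes the frames leading to the failure.
            System.err.println(stackTraceOf(e));
        }
    }
}
```

Capturing the trace as a String, as stackTraceOf does, also makes it easy to hand to whatever logger the application uses instead of relying on stderr.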
[20:00] * tdonohue starts DSpace Developers Meeting
[20:01] <tdonohue> Time to switch gears to our mtg. here's the agenda: https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-05-16
[20:01] <kompewter> [ DevMtg 2012-05-16 - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-05-16
[20:01] <tdonohue> we'll kick off with some JIRA reviews as always
[20:01] <tdonohue> https://jira.duraspace.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+DS+AND+resolution+%3D+Unresolved+AND+Key%3E%3DDS-1008+ORDER+BY+key+ASC
[20:01] <kompewter> [ https://jira.duraspace.org/browse/DS-1008 ] - [#DS-1008] Solr Statistics markRobotsByIP can mark too many IP addresses, including IP's not on the IP list - DuraSpace JIRA
[20:01] <kompewter> [ Issue Navigator - DuraSpace JIRA ] - https://jira.duraspace.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+DS+AND+resolution+%3D+Unresolved+AND+Key%3E%3DDS-1008+ORDER+BY+key+ASC
[20:01] <tdonohue> starting with DS-1008
[20:01] <kompewter> [ https://jira.duraspace.org/browse/DS-1008 ] - [#DS-1008] Solr Statistics markRobotsByIP can mark too many IP addresses, including IP's not on the IP list - DuraSpace JIRA
[20:02] <tdonohue> Looks like PeterDie_ already has a patch posted. Does this just need review, Peter?
[20:03] <tdonohue> This sounds like a bug to me, and the fix sounds reasonable
[20:04] * mdiggory (~mdiggory@rrcs-74-87-47-114.west.biz.rr.com) has joined #duraspace
[20:04] <tdonohue> Ok, since Peter isn't around, I suggest moving on from Ds-1008. But, my opinion would be to go ahead and commit this as long as others agree
[20:05] <tdonohue> Next up, DS-1009
[20:05] <kompewter> [ https://jira.duraspace.org/browse/DS-1009 ] - [#DS-1009] SOLR - Provide a mechanism to force a commit. - DuraSpace JIRA
[20:05] <mhwood> The fundamental problem is that string matching doesn't work very well with IP addresses. Note also that IPv6 addresses will cause new pain.
[20:05] <tdonohue> mhwood -- good point on IPv6
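mhwood's point about string matching can be illustrated with a short sketch: a textual prefix such as "192.168.1.1" also matches "192.168.1.10", whereas comparing the parsed address bytes against a network prefix does not, and the same byte comparison covers IPv6. The class and method names below are hypothetical and are not taken from the DS-1008 patch; inputs are assumed to be literal IP addresses rather than host names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class IpRangeMatcher {
    /** Parses a literal IPv4 or IPv6 address into its raw bytes. */
    private static byte[] bytesOf(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP address: " + literal, e);
        }
    }

    /** True if address lies inside network/prefixLength (CIDR-style comparison). */
    static boolean inRange(String address, String network, int prefixLength) {
        byte[] addr = bytesOf(address);
        byte[] net = bytesOf(network);
        if (addr.length != net.length) {
            return false; // an IPv4 address never matches an IPv6 network, and vice versa
        }
        if (prefixLength < 0 || prefixLength > addr.length * 8) {
            throw new IllegalArgumentException("bad prefix length: " + prefixLength);
        }
        int fullBytes = prefixLength / 8;
        for (int i = 0; i < fullBytes; i++) {
            if (addr[i] != net[i]) {
                return false;
            }
        }
        int remainingBits = prefixLength % 8;
        if (remainingBits == 0) {
            return true;
        }
        // Compare only the leading bits of the first partially covered byte.
        int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (addr[fullBytes] & mask) == (net[fullBytes] & mask);
    }

    public static void main(String[] args) {
        // Textual prefix matching accepts an address that is not the listed one.
        System.out.println("192.168.1.10".startsWith("192.168.1.1"));
        // Byte comparison with a /32 prefix requires an exact match.
        System.out.println(inRange("192.168.1.10", "192.168.1.1", 32));
    }
}
```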
[20:07] * mdiggory (~mdiggory@rrcs-74-87-47-114.west.biz.rr.com) Quit (Remote host closed the connection)
[20:07] <mhwood> DS-1009 already assigned to PeterDie_
[20:07] <tdonohue> yea, just wondering if it also needs feedback. Both Ds-1009 and Ds-1008 seem to just be sitting here
[20:07] <tdonohue> (with patches)
[20:08] <tdonohue> any thoughts on whether Ds-1009 is useful? Are others +1 on an option to force a commit from commandline
[20:08] * mdiggory (~mdiggory@rrcs-74-87-47-114.west.biz.rr.com) has joined #duraspace
[20:09] <mhwood> I see that mdiggory had another approach to Ds-1009
[20:09] <mdiggory> Don't I always...
[20:09] <hpottinger> I like mdiggory's suggestion of a commit when the JVM is shutdown
[20:10] <tdonohue> I also like that idea. Is this a separate ticket or should it be a replacement for Ds-1009?
[20:11] <mdiggory> The general method is probably good in the first place
[20:11] <tdonohue> i.e. is Ds-1009 "good enough" for now, and the next step should be to investigate hooking an auto-commit into JVM shutdown
[20:11] <hpottinger> 1009 looks to be wrapped up, good enough for now
[20:11] <mdiggory> Note, we also use settings in solr to control commit period....
[20:12] <mdiggory> And if solr itself does auto commit on shutdown it might not be necessary in DSpace
[20:13] <tdonohue> ok. Ds-1009 Summary: seems good enough for now. Let PeterDie_ know. (separate ticket perhaps to investigate whether we can perform an auto-commit on JVM shutdown or if Solr already does that)
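The "commit when the JVM is shut down" idea discussed above could be sketched with a standard JVM shutdown hook. This is only an illustration under stated assumptions: the class name is hypothetical, and the Runnable stands in for whatever call would actually flush pending statistics to Solr. Whether Solr already commits on shutdown, as mdiggory raises, is left open here.

```java
final class CommitOnShutdown {
    /** Registers pendingCommit to run during orderly JVM shutdown; returns the hook thread. */
    static Thread register(Runnable pendingCommit) {
        Thread hook = new Thread(pendingCommit, "commit-on-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real "commit pending documents" call.
        register(() -> System.out.println("committing pending statistics before exit"));
        System.out.println("working...");
        // When main returns, the JVM begins shutdown and runs the registered hook.
    }
}
```

Shutdown hooks run on normal exit and on signals such as SIGTERM, but not when the process is killed forcibly or Runtime.halt is called, so a hook like this would complement rather than replace an explicit commit option.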
[20:13] <tdonohue> Last one for today: DS-1012
[20:13] <kompewter> [ https://jira.duraspace.org/browse/DS-1012 ] - [#DS-1012] DSpace Shibboleth authentication module needs to support Lazy Authentication, NetID based authentication, and additional EPerson metadata - DuraSpace JIRA
[20:14] <tdonohue> looks to be assigned scottatm. I assume this must be waiting for feedback as well? Scott are you around?
[20:16] * mdiggory_ (~mdiggory@rrcs-74-87-47-114.west.biz.rr.com) has joined #duraspace
[20:16] * mdiggory (~mdiggory@rrcs-74-87-47-114.west.biz.rr.com) Quit (Quit: Colloquy for iPad - http://colloquy.mobi)
[20:16] * mdiggory_ is now known as mdiggory
[20:16] <mhwood> Looks like Stuart Lewis has tested this a bit.
[20:17] <tdonohue> Oh wait. it looks like DS-1012 is already committed (look at "Git commits" tab of that ticket)
[20:17] <kompewter> [ https://jira.duraspace.org/browse/DS-1012 ] - [#DS-1012] DSpace Shibboleth authentication module needs to support Lazy Authentication, NetID based authentication, and additional EPerson metadata - DuraSpace JIRA
[20:18] <tdonohue> Ds-1012 Summary : Get in touch with scottatm. Can this ticket be closed now? or does it require further work/review?
[20:19] <tdonohue> Ok. we'll stop there for today
[20:19] <tdonohue> Next up a few announcements/notes:
[20:19] <tdonohue> First, we have an OR12 Dev Mtg page up now: https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-07-09+-+OR12+Meeting
[20:19] <kompewter> [ DevMtg 2012-07-09 - OR12 Meeting - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-07-09+-+OR12+Meeting
[20:19] <tdonohue> Please add your name to the signup if you plan to attend! (no real agenda yet, but you are welcome to add in ideas)
[20:20] <tdonohue> As I think everyone here already saw, I'm now starting to hold an "Office Hours" in this IRC channel for the 3hrs before this meeting: https://wiki.duraspace.org/display/~tdonohue/DSpace+Office+Hours So, if you need someone to bounce ideas off of or need some more "eyes on an issue", feel free to stop in
[20:20] <kompewter> [ DSpace Office Hours - Tim Donohue - DuraSpace Wiki ] - https://wiki.duraspace.org/display/~tdonohue/DSpace+Office+Hours
[20:21] <tdonohue> ( you are welcome to ping me though whenever I'm in IRC -- it's just that during "Office Hours", I promise I'll respond promptly!)
[20:21] <tdonohue> Finally -- this week I realized that our JIRA wasn't setup to sync with DSpace GitHub yet. So, our "Git Commits" tab on individual tickets wasn't working. But, that should now be resolved
[20:22] <tdonohue> for example, see the recently merged pull request (Ds-1072): https://jira.duraspace.org/browse/DS-1072?page=com.xiplink.jira.git.jira-git-plugin:git-commits-tabpanel#issue-tabs
[20:22] <kompewter> [ https://jira.duraspace.org/browse/DS-1072 ] - [#DS-1072] Post-registration link has a blank href if hosting DSpace at root instead of at /xmlui - DuraSpace JIRA
[20:22] <kompewter> [ [#DS-1072] Post-registration link has a blank href if hosting DSpace at root instead of at /xmlui - DuraSpace JIRA ] - https://jira.duraspace.org/browse/DS-1072?page=com.xiplink.jira.git.jira-git-plugin:git-commits-tabpanel#issue-tabs
[20:22] <mdiggory> that's good... does it handle showing pull requests at all?
[20:22] <tdonohue> yep. as long as the Pull Request *mentions* the JIRA ticket number
[20:22] * kimiora (kim@ec2-184-73-177-234.compute-1.amazonaws.com) has joined #duraspace
[20:23] <tdonohue> (again, see Ds-1072 link above, as the pull request there mentioned the ticket number)
[20:23] * kimiora is now known as kshepherd
[20:23] <kshepherd> hi all
[20:23] <tdonohue> hi kshepherd, welcome :)
[20:23] <mdiggory> long time kshepherd
[20:23] <kshepherd> mm
[20:23] <mhwood> Hey, everybody, he's here!
[20:24] * kshepherd looks nervous
[20:24] <tdonohue> Ok, that was it for "quick notes". The only other thing on the agenda was more discussion of 3.0 Planning (based on what came out of last week's 3.0 Skype Planning meeting)
[20:25] <tdonohue> I tried to sum up what *I* felt came out of last week's meeting (see agenda: https://wiki.duraspace.org/display/DSPACE/DevMtg+2012-05-16)
[20:26] <mdiggory> I think *you* just about covered it.
[20:26] <tdonohue> essentially 4 points: (1) we are starting to have concerns about long term viability of Cocoon, (2) we seem to prefer that 3.0 should continue as a "gradual evolution" of the platform, (3) though others are welcome to work towards doing deeper re-architecture work as long as it's kept open/shared and (4) we still need to figure out 3.0 details
[20:27] <tdonohue> thanks mdiggory. I only stressed *I*, cause I wasn't sure if I missed anything or if others came away with something different ;)
[20:27] <mdiggory> I do think we need to be careful how we word what we say about cocoon...
[20:28] <tdonohue> +1 mdiggory
[20:29] <kshepherd> in what way?
[20:29] <mdiggory> the concerns are not with its stability, nor Apache in general, just with its release process and its popularity in relation to other frameworks
[20:29] <kshepherd> oh, true
[20:29] <mdiggory> and some have concerns with its memory footprint
[20:30] <kshepherd> plus, we wouldn't want to offend the thousands of active cocoon maintainers... ;)
[20:30] * kshepherd hides
[20:30] <mdiggory> my point is that we don't want to scare our own community into thinking XMLUI isn't mature or stable in the process of discussing other technologies
[20:31] <mhwood> True.
[20:31] <tdonohue> that, and we need to be clear that Cocoon + XMLUI is still the "primary" UI (most features/most support) and will continue to be so in 3.0. So, we need to also be clear that we're just planning for the future -- this is less about 3.0 specifically and more about the fact that we need to think about building new UIs over the long term
[20:31] <tdonohue> mdiggory & I just said essentially the same thing :)
[20:31] <mdiggory> well put tdonohue
[20:33] <tdonohue> ok. so, all that being said. What *is* going to be in 3.0? Anything anyone has hiding up their sleeves (or cool ideas to brainstorm)?
[20:33] <tdonohue> Currently, what we have as "tentative" are the @mire / Dryad stuff (which is all very very cool by the way): https://wiki.duraspace.org/display/DSPACE/DSpace+Release+3.0+Notes#DSpaceRelease3.0Notes-NewfeaturesinDSpace3.0%28verytentative%29
[20:33] <kompewter> [ DSpace Release 3.0 Notes - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpace+Release+3.0+Notes#DSpaceRelease3.0Notes-NewfeaturesinDSpace3.0%28verytentative%29
[20:33] <kshepherd> well
[20:34] <mdiggory> tdonohue: and these targets are still open to requirements analysis by the community et al...
[20:34] <kshepherd> i have a few ideas
[20:34] <mhwood> Not interesting to users, but the "portable" test environment is in master now.
[20:35] <mdiggory> for instance, this morning I initiated a dialog on some of the Advanced Embargo Proposal features.... http://sourceforge.net/mailarchive/message.php?msg_id=29274842
[20:35] <tdonohue> right mdiggory -- agreed. they are still tentative based on approval / etc
[20:35] <kompewter> [ SourceForge.net: DSpace: ] - http://sourceforge.net/mailarchive/message.php?msg_id=29274842
[20:35] <tdonohue> mhwood: yeah! good for developers though!
[20:35] <tdonohue> kshepherd -- I'd definitely like to hear your brainstorms/ideas
[20:36] <mdiggory> tdonohue: the goal in listing them and documenting them in the community is to give more opportunity for community feedback earlier in the process at both the business and technical levels.
[20:37] <tdonohue> +1 mdiggory -- yep, I like that open approach to planning/designing/development
[20:39] * tdonohue is waiting for kshepherd's words of wisdom / brainstorms ;)
[20:39] <kshepherd> heh... let me write them down and describe them properly and i'll post them on the list
[20:39] <tdonohue> cool. sounds good to me
[20:40] <kshepherd> one very little thing we've been talking about at UoA is how useful PRISM is in addition to DC for biblio/citation elements
[20:40] <mdiggory> mhwood... so, do we have examples of how to use the testing environment outside of dspace-api?
[20:40] <kshepherd> would be nice to ship with PRISM in the metadata registry
[20:40] <mhwood> kshepherd: there has been interest in PRISM before.
[20:41] <kshepherd> and get that worked into openurl, sfx, etc stuff (plus i've been playing around with CSL and citeproc-js to generate citations in 100s of different styles from record metadata)
[20:41] <mhwood> mdiggory: there will be as soon as I get back to some of the JIRA tickets that have been waiting on testing support.
[20:42] <mdiggory> kshepherd: the challenge is how to deal (similar to mods) with subfields.
[20:42] <tdonohue> ooh -- CSL/citeproc-js. I remember some of that from my old "BibApp" days.
[20:42] <kshepherd> tdonohue: it's quite cool. wish there was a java processor though :P
[20:43] <tdonohue> kshepherd -- I think a lot of that has interest in the community. I know DCAT is also wanting to find ways to add more metadata standards (PRISM & others) to DSpace out-of-the-box. That's what this actually refers to: https://wiki.duraspace.org/display/DSPACE/Develop+support+for+additional+metadata+standards
[20:43] <kompewter> [ Develop support for additional metadata standards - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/Develop+support+for+additional+metadata+standards
[20:43] <kshepherd> mdiggory: haven't encountered any subfields yet.. just standard stuff like prism:startingpage, prism:number, prism:journal... other useful things that aren't straight-forward in DC
[20:44] <mdiggory> PRISM also has to do with entity types.... Person, Organization, Events, References, ... how do we get out into these types of metadata records? Do we add support for RDF(OWL), XML, XSD/DTD and other types of "records"?
[20:44] <tdonohue> kshepherd -- if you are working CSL / citeproc-js into Dspace, I think that also would be of broader interest. They provide a nice clean way to get a formatted citation that could be displayed on an Item view page or similar
[20:44] <mdiggory> we're talking about http://www.idealliance.org/specifications/prism/ right?
[20:44] <kompewter> [ Publishing Requirements for Industry Standard Metadata | IDEAlliance ] - http://www.idealliance.org/specifications/prism/
[20:45] <kshepherd> tdonohue: yep, that was my thought
[20:46] <kshepherd> mdiggory: right, but i'm just talking about the prism namespace/schema itself, to start with, perhaps not the whole standards/guidelines
[20:46] <kshepherd> i realise there's more to it
[20:46] <mhwood> We probably do need support for more complex metadata....
[20:47] <mdiggory> I was commenting to tdonohue a little while ago that I now understood why the Fedora developers stuck to "datastreams" and didn't try to solve largescale metadata representation issues.
[20:47] <kshepherd> if i'm going to generate citations out of elements, i need people to use a standard/agreed-upon field for stuff like pagination, parent titles, volume, issue... all that stuff is well-documented as simple PRISM fields
[20:48] <kshepherd> it's not going to solve dspace's metadata problems, but i don't see how it hurts..
[20:48] <tdonohue> kshepherd -- right, the DC schema is lacking in terms of all the fields necessary to generate a citation -- there is no volume, issues, page numbers, etc.
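To make kshepherd's argument concrete, here is a rough sketch of how agreed-upon flat fields allow a citation to be assembled mechanically. The field keys are illustrative only: they borrow the names used in the chat (prism:journal, prism:number, prism:startingpage) in a dotted key form, "prism.volume" is an assumption, and none of them are checked against the PRISM specification. The output layout is likewise an arbitrary choice, not a real citation style of the kind CSL would provide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FlatCitation {
    /** Appends prefix + value + suffix, but only when the field is present. */
    private static void append(StringBuilder out, String prefix, String value, String suffix) {
        if (value != null && !value.isEmpty()) {
            out.append(prefix).append(value).append(suffix);
        }
    }

    /** Builds "Title. Journal volume(issue), page" from flat fields, skipping missing ones. */
    static String format(Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        append(out, "", fields.get("dc.title"), ".");
        append(out, " ", fields.get("prism.journal"), "");
        append(out, " ", fields.get("prism.volume"), "");
        append(out, "(", fields.get("prism.number"), ")");
        append(out, ", ", fields.get("prism.startingpage"), "");
        return out.toString().trim();
    }

    public static void main(String[] args) {
        // Hypothetical item metadata, stored as simple key/value pairs.
        Map<String, String> item = new LinkedHashMap<>();
        item.put("dc.title", "A Study");
        item.put("prism.journal", "Journal of Examples");
        item.put("prism.volume", "12");
        item.put("prism.number", "3");
        item.put("prism.startingpage", "45");
        System.out.println(format(item));
    }
}
```

The point of the sketch is only that the formatter needs to know where volume, issue and pagination live; without a shared field for each, every repository would need its own mapping.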
[20:48] <mhwood> kshepherd: I think that would do a lot for DSpace, and more can come later.
[20:48] <mdiggory> kshepherd: but we also see similar support in bibliontology and others... (devils advocate)
[20:49] <tdonohue> +1 mhwood -- I'm one that agrees that small steps are better than no steps (while we wait for something larger)
[20:49] <kshepherd> mdiggory: both! ;)
[20:50] <snail> i think that there are potentially huge wins if we support the forthcoming RDA standard
[20:50] <tdonohue> I fully agree that we need to think about ways to do larger things as well (like support hierarchical metadata). But, I'm all for adding basic PRISM (or other) schemas into out-of-the-box DSpace just to support some additional "flat" fields
[20:51] <tdonohue> RDA (for non-librarians) = http://en.wikipedia.org/wiki/Resource_Description_and_Access
[20:51] <kompewter> [ Resource Description and Access - Wikipedia, the free encyclopedia ] - http://en.wikipedia.org/wiki/Resource_Description_and_Access
[20:51] <mdiggory> I will try not to go into a long production concerning how we should probably move away from storing metadata in database tables... which would free us up to approach more flexible forms of metadata stored in bitstreams and so-on
[20:52] <tdonohue> (snail -- thanks for the reminder on RDA)
[20:52] * helix84 (a@ has joined #duraspace
[20:53] <tdonohue> mdiggory -- I think that is definitely worth considering, to be honest. But, that would affect so many areas of DSpace obviously
[20:53] <helix84> mdiggory: I would actually like to discuss just that sometime, perhaps after the meeting?
[20:53] <kshepherd> mdiggory: sometimes i love that fedora metadata is just xml datastreams, sometimes i don't...
[20:53] <mdiggory> the take home... maybe we shouldn't add more metadataschema in that traditional DSpace sense.... focus on getting DSpace away from requiring such decisions and heavy engineering work to support richer metadata....
[20:54] <mhwood> It sounded icky until I realized that we can build all the indices we want off of the datastreams.
[20:54] <mhwood> Well, we'd have to rework the metadata "plumbing" either way.
[20:54] <kshepherd> batch editing becomes very icky
[20:55] <mhwood> So maybe the way is to reimagine the metadata store altogether.
[20:55] <helix84> that's what i was thinking
[20:55] <tdonohue> kshepherd is right that batch stuff does become a bit more complex.
[20:55] <mdiggory> kshepherd: batch editing becomes batch file processing
[20:56] <kshepherd> right, and what tools are out there to support that for end users / repo managers?
[20:56] <kshepherd> i do totally hear you on the richer metadata thing ;)
[20:56] <mdiggory> SWORD ;-)
[20:56] <helix84> should we have a wiki page with ideas?
[20:56] <kshepherd> this is just one hurdle i had when i was last using fedora... my client wanted to keep using spreadsheet software to help manage exports/imports/global changes
[20:57] <mdiggory> sorry kshepherd that wasn't being serious
[20:57] <kshepherd> but once we'd moved to some hierarchical XML schema like MODS or MARCXML, it was too late
[20:57] <tdonohue> but, I also admittedly like the "simplicity" of potentially treating everything as a file, and then indexing the heck out of metadata streams (in Solr or similar) for browse/search/item-view support
[20:57] <snail> SWORD is insufficient for rich metadata
[20:57] <snail> SWORD assumes that docs are independent units floating free in space
[20:57] <tdonohue> SWORD is a transmission protocol though -- it doesn't care about metadata inside it. Not sure I understand why it's insufficient?
[20:58] <mdiggory> snail: true, but swordv2 updates can adjust individual bitstreams
[20:58] <kshepherd> yeah tdonohue++
[20:58] <snail> metadata schemes like RDA focus strongly on the connections between docs
[20:58] <mdiggory> without having a great deal of heavy architecture to maintain in DSpace
[20:58] <kshepherd> SWORD isn't about item/metadata management, it's just a way of squirting stuff into a repository
[20:59] <mdiggory> APP is about POSTing and updating blog articles...
[21:00] <tdonohue> I like the idea of doing more things with SWORD though -- just that we'd need to re-do some tools (like batch editor or similar) to become SWORD tools or similar. I.e. this starts to "snowball" into a "let's rebuild DSpace discussion" -- we'd want to plan this out in stages if we decided on this route
[21:00] <kshepherd> i dunno
[21:00] <mdiggory> APP is about posting and updating Items with attachments.... SWORD is about posting and updating Items with attachments
[21:01] <kshepherd> i was just pointing out one positive side-effect of our simple key/value metadata approach so far... we can farm out complicated global changes tasks to external tools like Openoffice/Excel/Refine that handle all that stuff really well
[21:01] <tdonohue> helix84 may have said it best in that we may want to start writing down some of these ideas / hashing out ways to potentially support this concept little by little.
[21:01] <kshepherd> now i know that's no good for complex objects or hierarchical schemas..
[21:01] <mhwood> And then you can have all the tools you want which understand the attachments.
[21:02] <tdonohue> kshepherd -- very true (the openoffice/excel editing is a nice little side-effect)
[21:02] <mdiggory> so the trade off is that we have to create more community tools to manage metadata on the client side... isn't this where other projects are (fedora, Hydra, ...)
[21:03] <hpottinger> so... if we're leaning towards the Fedora model (metadata as another bitstream), and DuraSpace says we should work towards an option of running DSpace on top of a Fedora Repository...
[21:03] <kshepherd> see, i still don't understand that myself..
[21:03] <mhwood> I dunno that DSpace has to come with all that stuff. People who want metadata flavor X can code up something for X, as long as we can store it and (given a suitable interpreter provided by the X people) index it.
[21:03] <kshepherd> i like fedora, but if i want to use it.. it already exists!
[21:05] <KevinVdV> *Needs to run*
[21:05] * KevinVdV (~kevin@d54C154B1.access.telenet.be) Quit (Quit: KevinVdV)
[21:05] <mdiggory> kshepherd: exactly ?!
[21:06] <tdonohue> I think that if we start to find ourselves heading towards a Fedora-like model, then we definitely need to take a step back and say: "Why aren't we just building DSpace as a UI/Java-business-logic on Fedora". But, the key question is whether our architecture is or should be heading towards something like Fedora
[21:06] <kshepherd> so why are we always automatically trying to turn dspace into fedora, or put one inside the other, or whatever?
[21:06] <mdiggory> because we've all learned a lot over the last decade...
[21:07] <helix84> kshepherd: probably because each has some shiny things that the other one doesn't
[21:07] <tdonohue> kshepherd -- good question. Honestly, we don't *have* to try and turn DSpace into Fedora (or use them together). It's one of many possible directions that DSpace could move into...
[21:07] * tdonohue notes we're in overtime..but this is interesting discussion, so I'm gonna keep going
[21:08] <mdiggory> I think because the technology stacks are similar and the problem domains are similar and now... they share the same organization...
[21:08] <kshepherd> i'm just asking questions to force people to re-justify their positions ;)
[21:09] <tdonohue> It is worth noting that while DSpace devs are going through these discussions around "how the DSpace architecture needs to change", Fedora devs are going through the *exact same discussion* in regards to their upcoming Fedora 4.0
[21:09] <tdonohue> i.e. we are both encountering a "grass is always greener" viewpoint at the same time
[21:09] <mdiggory> and now we are talking about REST... seems like they already dealt with REST
[21:10] <mhwood> Any sufficiently developed software winds up needing to play as a component.
[21:10] <kshepherd> yeh so i agree with all of this, what i don't see are the 'shiny bits' in dspace
[21:10] <helix84> kshepherd: XMLUI?
[21:10] <kshepherd> O_o
[21:10] <mhwood> You open a box of DSpace, start it, and it works.
[21:10] <mdiggory> TBH, its got a very big community. Thats pretty shiny.
[21:11] <hpottinger> shiny bits: a much, much better out of the box experience
[21:11] <kshepherd> ok, phew, there are some shiny bits :)
[21:11] <tdonohue> Shiny parts include: big community, out-of-the-box, and we have some "workflow pieces" (submission/workflow stuff) which I've heard Hydra folks at least drool over
[21:12] <mhwood> Oracle has Oracle DB and BerkeleyDB. They do similar things for communities with different requirements. Should Duraspace be like that?
[21:12] <mdiggory> \me imagines a Hydra drooling can get very messy
[21:12] <hpottinger> I've been tinkering a bit with Fedora lately, and I have to tell you, it feels really good to switch back to working with DSpace
[21:12] <tdonohue> mhwood? not sure I understand?
[21:13] <mhwood> Rather than trying to make two products more alike, should we consider their differences as advantages? With two products you can cover more requirements with less complexity in each.
[21:13] <kshepherd> if i understand mhwood correctly, he's meaning that we (and duraspace) can acknowledge that there are many types of repository users... dspace probably solves 90% of needs for people with simple institutional repositories full of theses and journal articles
[21:14] <kshepherd> fedora solves problems for people who want complex relationships between objects, bigger/better metadata schemas, etc
[21:15] <hpottinger> best experience so far with anything Fedora-related is Fedora CloudSync, nice little tool, easy to build and deploy (uses maven), well-documented
[21:15] * kshepherd doesn't even know what he thinks anymore ;)
[21:15] <mdiggory> I don't think we are trying to make DSpace like Fedora, I think we are trying to make it easy to use Fedora with DSpace and gain flexibility where DSpace has limitations by doing so.
[21:15] <tdonohue> Honestly, my view is that DSpace needs to evolve based on the Communities modern needs. If those modern needs make us think towards a Fedora-like architecture, then we should just use Fedora & move our UI & tools to that. But, if they bring us elsewhere, then we shouldn't try and build on Fedora & instead we should look towards other tools
[21:15] <mdiggory> what are our communities modern needs?
[21:16] <mdiggory> less "shoe-horning"?
[21:16] <tdonohue> sorry that should be "community's modern needs"
[21:16] <mdiggory> that too
[21:17] <tdonohue> well, some of those needs are coming straight from DCAT -- we need to support more modern & hierarchical metadata standards. We need to support metadata on all types of objects, etc. etc.
[21:17] <kshepherd> reliable / configurable serialisation to RDFXML? :)
[21:17] <tdonohue> the "modern needs" are those things that we all know that DSpace needs
[21:18] <helix84> at the risk of repeating myself, should there be a wiki page for motivations for using Fedora with DSpace?
[21:18] <tdonohue> to go back to mhwood's point on 'consider the differences as advantages' -- I think DuraSpace already tries to do that. We aren't talking about DSpace & Fedora as similar -- rather they are different products for different needs.
[21:18] <mhwood> But maybe we need to do those things differently, if there are other good ways to do them.
[21:18] <mdiggory> https://wiki.duraspace.org/display/DSPACE/DSpace+Fedora+Collaboration
[21:18] <kompewter> [ DSpace Fedora Collaboration - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpace+Fedora+Collaboration
[21:19] <tdonohue> mhwood -- I'd agree & am all ears for suggestions :)
[21:19] <mdiggory> https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:19] <kshepherd> helix84: can't hurt! personally, i think i'll always be erring towards the view that instead of trying to "use them together", if you want what fedora does, you'll probably be better off just using fedora
[21:19] <kompewter> [ Fedora Integration - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:19] <hpottinger> shoot, this is really interesting, but I gotta go
[21:19] <mhwood> Yes, I have to go too.
[21:20] <tdonohue> yea, this is the latest info on the DSpace + Fedora brainstorms: https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:20] <kompewter> [ Fedora Integration - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:20] <mhwood> Will try to catch up later.
[21:20] <kshepherd> i will point out that the UI we're working on at UoA equally well over dspace+discovery and fedora+rsearch :)
[21:20] * mhwood (mwood@mhw.ulib.iupui.edu) has left #duraspace
[21:20] * hpottinger (~hpottinge@mu-162198.dhcp.missouri.edu) has left #duraspace
[21:20] <kshepherd> i gotta get back to work, too :/
[21:20] <kshepherd> s/equally/equally well/
[21:20] <kompewter> kshepherd meant to say: i will point out that the UI we're working on at UoA equally well well over dspace+discovery and fedora+rsearch :)
[21:20] <helix84> I'm reading https://wiki.duraspace.org/display/DSFED/DSpace-Fedora+Integration+FAQ
[21:20] <kompewter> [ DSpace-Fedora Integration FAQ - DSpace Fedora Collaboration - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSFED/DSpace-Fedora+Integration+FAQ
[21:20] <kshepherd> err
[21:21] * kshepherd stops trying to type :(
[21:21] <helix84> and the motivations seem overly general to me. What I meant was more concrete benefits.
[21:21] <mdiggory> thanks kshepherd
[21:22] <tdonohue> helix84: what sort of things are you looking for? We had some initial "brainstormed" benefits listed here: https://wiki.duraspace.org/display/DSFED/DSpace-Fedora+Integration+FAQ#DSpace-FedoraIntegrationFAQ-WhatdoesintegrationmeanformeasaDSpaceuser%3FWhyshouldIbeinterested%3F
[21:22] <kompewter> [ DSpace-Fedora Integration FAQ - DSpace Fedora Collaboration - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSFED/DSpace-Fedora+Integration+FAQ#DSpace-FedoraIntegrationFAQ-WhatdoesintegrationmeanformeasaDSpaceuser%3FWhyshouldIbeinterested%3F
[21:22] <mdiggory> I think we see features in Fedora that are attractive as solutions to problems we have in DSpace with hierarchical metadata standards and metadata on all types of objects
[21:22] <helix84> tdonohue: exactly that, but there are just 2 concrete benefits there
[21:23] <mdiggory> relationships between DSpace Items and Items with other worldly "things"
[21:23] * tdonohue notes for those who have/had to leave, we can also continue this discussion more at next week's meeting (or ping me during "office hours" next Weds too, if you want)
[21:24] <mdiggory> a somewhat generic storage solution for various DSO and other DSpace objects
[21:24] <mdiggory> that is very extensible/flexible...
[21:24] <tdonohue> helix84: right, but those were just brainstorms from DuraSpace. When we initially posted this concept of "DSpace with Fedora Inside" we had hoped the community would help us flesh it out more. Unfortunately, it's mostly just sat there as an "idea" that so far hasn't gained traction.
[21:25] <mdiggory> but yet that is also the downfall, because part of DSpace's simplicity is its dependency on a very simple architecture
[21:25] <tdonohue> I think it would be worth brainstorming what other "concrete benefits" there may be & posting them to: https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:25] <kompewter> [ Fedora Integration - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/Fedora+Integration
[21:25] <mdiggory> its a double edged sword
[21:26] <mdiggory> tdonohue: "we had hoped the community would help us flesh it out more"... is where it lost traction...
[21:26] <tdonohue> +1 mdiggory on it being a double-edged sword. If we went the route of "DSpace on Fedora" we'd have to be very careful to keep the DSpace experience "simplistic/easy" and not complicate it too much
[21:26] <mdiggory> its an activity that needs to be led
[21:27] <helix84> probably because (speaking for myself) community doesn't know what it would let us do that is not currently possible
[21:28] <mdiggory> community is looking for leadership and direction on the topic because it is very "pie in the sky" (this coming from someone who even thinks a lot about trying to make it happen)
[21:28] <helix84> when you say versioning and relationships between objects (like journal/articles), I say yeah! when you say flexible, I say bah! :)
[21:28] <tdonohue> it is an activity that needs to be led, but at the same time, I think helix84 is right. It's maybe a bit too "pie in the sky" to get the developers we need to really do it.
[21:29] <mdiggory> helix84: but where you say relationships between journals and articles and I say, no, my relationships are between Works and Images.... then maybe flexibility looks more important?
[21:29] <ryscher> The problem is that the primary Fedora "features" aren't things that an average repository manager cares about.... until something breaks. My favorite explanation: http://dltj.org/article/why-fedora-because-you-dont-need-fedora/
[21:29] <kompewter> [ Why Fedora? Because You Don’t Need Fedora | Disruptive Library Technology Jester ] - http://dltj.org/article/why-fedora-because-you-dont-need-fedora/
[21:30] <helix84> of course flexibility is good. I only meant concrete=good motivation / abstract = nice, but what's in it for me
[21:31] <tdonohue> ryscher: you can do that now as well with DSpace AIPs ;) I based the idea of DSpace AIPs off the fact that you needed to be able to do a full restore just like Fedora can do. Though, admittedly, it's not as good as Fedora 'cause you have to *export* those AIPs still
[21:31] <mdiggory> " just program the new system to read these METS-like packages"
[21:31] <mdiggory> may/may not be so "easy"
[21:32] <mdiggory> tdonohue: in the original prototype from MIT, those AIPs were just files in the assetstore (filesystem) ;-)
[21:32] <tdonohue> mdiggory: in regards to leading "DSpace w/Fedora Inside". I agree completely. It's just that DuraSpace's resources are limited and we need to find a way to leverage the large community if we head this route. So, we need the right "rallying cry" or a way to convince folks better.
[21:33] <helix84> ryscher: that's a nice article
[21:33] <mdiggory> ryscher: and thats where intermediary tools/applications are responsible for filling that gap.
[21:34] <mdiggory> what you see with Hydra and Islandora....
[21:34] <mdiggory> and I think what we are seeing is that DSpace was already in that intermediary space
[21:34] <tdonohue> mdiggory: yea, as much as I wanted to keep around that old "AIPs in the assetstore" MIT code, I wasn't able to -- it wasn't stable enough & at the time I was under time constraints. Still could be an idea to bring back someday/somehow
[21:35] <ryscher> so... while I would love to see "DSpace with Fedora inside" as a new project, I think we can get to the same place by taking advantage of opportunities to synchronize, like the aforementioned metadata transition
[21:37] <mdiggory> ryscher: do you mean by putting metadata on more dspace objects, or the point about storing it in bitstreams rather than db tables.
[21:37] <ryscher> storing it in bitstreams
[21:39] <mdiggory> I think the concern we see voiced here is that we have InputForms, MetadataRegistry and so on all modeled and UI tooling built on top of... when we say store the metadata in a bitstream, we need an answer as to the impace it will have on these features for the end user.
[21:39] <mdiggory> impace = impact
[21:40] * mdiggory wishes he could go back and correct grammar in his IRC posts.
[21:40] <ryscher> yes, to get community buy-in, you need to make the change nearly invisible. But then everyone will ask why the change is needed.
[21:41] <mdiggory> and likewise, I think to get community investment there needs to be a shiny big win as well.
[21:42] <mdiggory> so that question has an answer
[21:42] <mdiggory> the change was needed because we want to store more hierarchical metadata
[21:42] <mdiggory> or relationships between Items without altering the database...
[21:42] <mdiggory> ... schema
[21:44] <mdiggory> better support for the metadata flavor the organization wants to use rather than shoe-horning into an already antiquated approach that DCMI abandoned...
[21:45] <mdiggory> sorry, that was a bit brutal. DC is great...
[21:45] <mdiggory> long live DC
[21:45] <tdonohue> unfortunately, I have to leave in a few :(
[21:45] <tdonohue> haha
[21:46] <mdiggory> great conversation, thanks for the input everyone.
[21:47] <snail> mdiggory: the ideal answer to your question is "dspace needs fedora inside so it can support full RDA"
[21:50] * ryscher (98033b9f@gateway/web/freenode/ip. Quit (Quit: Page closed)
[21:50] <tdonohue> parting words: the thing I've learned from "DSpace w/Fedora Inside" is that these new architecture ideas really need to bubble-up from the community (rather than "top-down" from DuraSpace). DuraSpace can make all the suggestions we want, but we don't have the resources to actually implement them. We need to evolve DSpace towards the immediate needs of the community and have architecture changes bubble-up as needed to meet those
[21:51] <tdonohue> So, a lot of what ryscher was saying about finding the "shiny" and building towards it ;)
[21:51] <tdonohue> ok..gotta go unfortunately. But, great discussion.
[21:52] * tdonohue (~tdonohue@c-67-177-108-221.hsd1.il.comcast.net) Quit (Read error: Connection reset by peer)
[23:23] * helix84 (a@ has left #duraspace

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.