#duraspace IRC Log


IRC Log for 2010-07-21

Timestamps are in GMT/BST.

[0:08] * ksclarke1 (~kevin@adsl-235-128-38.clt.bellsouth.net) has joined #duraspace
[0:10] * ksclarke (~kevin@adsl-235-32-25.clt.bellsouth.net) Quit (Ping timeout: 260 seconds)
[2:53] * ksclarke1 is now known as ksclarke
[2:53] * ksclarke is now known as Guest65996
[3:03] * Guest65996 (~kevin@adsl-235-128-38.clt.bellsouth.net) Quit (Quit: Leaving.)
[6:11] * mdiggory (~mdiggory@ip72-199-217-116.sd.sd.cox.net) Quit (Quit: mdiggory)
[6:12] * Tonny_DK (~thl@ has joined #duraspace
[11:45] -card.freenode.net- *** Looking up your hostname...
[11:45] -card.freenode.net- *** Checking Ident
[11:45] -card.freenode.net- *** Found your hostname
[11:45] -card.freenode.net- *** No Ident response
[11:45] [frigg VERSION]
[11:45] * DuraLogBot (~PircBot@duraspace.org) has joined #duraspace
[11:45] * Topic is 'Welcome to DuraSpace - This channel is logged - http://duraspace.org/irclogs/'
[11:45] * Set by cwilper on Tue Jun 30 20:32:05 UTC 2009
[12:16] * mhwood (~mhwood@2001:18e8:3:171:218:8bff:fe2a:56a4) has joined #duraspace
[13:25] * tdonohue (~tdonohue@c-98-228-50-55.hsd1.il.comcast.net) has joined #duraspace
[13:48] * pvillega_ (~pvillega@ has joined #duraspace
[13:52] * pvillega (~pvillega@ Quit (Ping timeout: 276 seconds)
[14:36] * ksclarke (~kevin@adsl-235-128-38.clt.bellsouth.net) has joined #duraspace
[14:57] * JimOttaviani (8dd3ac8d@gateway/web/freenode/ip. has joined #duraspace
[14:57] * JimOttaviani (8dd3ac8d@gateway/web/freenode/ip. has left #duraspace
[16:39] * pvillega_ (~pvillega@ Quit (Remote host closed the connection)
[19:40] * grahamtriggs (~grahamtri@cpc1-stev6-2-0-cust340.9-2.cable.virginmedia.com) has joined #duraspace
[19:55] * mdiggory (~mdiggory@ip72-199-217-116.sd.sd.cox.net) has joined #duraspace
[19:57] * keithgilbertson (~keithgilb@ has joined #duraspace
[19:59] <tdonohue> Hi all -- today's DSpace Dev Mtg agenda: https://wiki.duraspace.org/display/DSPACE/DevMtg+2010-07-21
[19:59] <tdonohue> (if you have anything else to add, let me know)
[20:01] <tdonohue> Ok -- we may as well get started. I had GSoC updates first. I've noticed there's been a lot of activity on the Wiki for the GSoC projects (great to see many of the students using it heavily): https://wiki.duraspace.org/display/DSPACE/Google+Summer+of+Code
[20:01] <tdonohue> Were there any specific GSoC updates anyone wants to share?
[20:01] * JOttaviani (8dd3ac8d@gateway/web/freenode/ip. has joined #duraspace
[20:02] * JOttaviani (8dd3ac8d@gateway/web/freenode/ip. has left #duraspace
[20:03] <mdiggory> First, let's talk about schedule
[20:03] * jimottaviani (8dd3ac8d@gateway/web/freenode/ip. has joined #duraspace
[20:03] <tdonohue> mdiggory: are there any GSoC upcoming meetings to know about (I know this was mentioned as still being worked on at/before OR10)?
[20:03] <tdonohue> mdiggory : we're thinking alike -- go ahead :)
[20:03] <mdiggory> trying not to flood IRC
[20:03] <mdiggory> Monday, August 9 Suggested 'Pencils Down'
[20:04] <mdiggory> Friday, August 20 Final Evaluation Deadline/Final Payment Issued
[20:04] <mdiggory> Monday, August 23 Final Results Announced
[20:04] <mdiggory> Monday, August 30 Students Submit Code Samples
[20:04] <tdonohue> (FYI for all -- this schedule is also embedded on our GSoC wiki page: https://wiki.duraspace.org/display/DSPACE/Google+Summer+of+Code)
[20:04] <mdiggory> All students did receive passing grades for the midterm.
[20:05] <tdonohue> that's great news all around!
[20:05] <mdiggory> But now, I think, begins the challenging period for us... how do we finish out this year so that the projects follow through to codebase / trunk where appropriate?
[20:05] <mhwood> JIRA incident to track each one?
[20:06] * JoseBlanco (8dd32b9d@gateway/web/freenode/ip. has joined #duraspace
[20:07] <mdiggory> Well, we started out by failing to set up a JIRA project for GSoC; now I'm feeling that was maybe a bad choice...
[20:08] <mdiggory> so, now that we have material to bring into various levels of core, it would be good to see tickets in appropriate places for those activities "DSpaceJIRA" , "Services JIRA" etc
[20:09] <tdonohue> yea -- it's better to split up into manageable "chunks"
[20:10] <mdiggory> But, simply getting everyone used to creating tickets around the GSOC projects is the first challenge
[20:10] <tdonohue> it'd be harder to "approve" a JIRA issue if it's really just a placeholder for the whole project -- I imagine for some projects only "part" may be trunk-ready, and the rest may need a bit more work
[20:10] <mdiggory> I see Unit testing as a good first candidate
[20:11] <tdonohue> well -- couldn't we consider the creation of final JIRA issues as part of the "final project report" (i.e. the final deliverable)
[20:11] <mhwood> Binding each project more tightly to its place in DSpace than to GSoC also emphasizes that this is "real", not just an exercise. We want to see this stuff become production-ready.
[20:12] <tdonohue> mhwood +1
[20:13] <mdiggory> I agree, that not all the projects will be good candidates for trunk, but ticketing for other projects needs to target appropriate modules projects
[20:14] <mdiggory> so for storage and semantic web projects we have the "DSpace Services" JIRA project that might be a good home
[20:15] <tdonohue> mdiggory -- yes, I agree with that. That's what I was trying to get at by saying that a "JIRA placeholder" is less useful than splitting up the project and adding separate (but related) JIRA issues into each of the appropriate modules in JIRA
[20:15] <mhwood> Might even want several tickets for a single project if it usefully breaks down into components in varying states of readiness.
[20:16] * mdiggory_ (~mdiggory@ip72-199-217-116.sd.sd.cox.net) has joined #duraspace
[20:16] <tdonohue> exactly, mhwood -- in my opinion, several tickets is much better than one ticket (as most of these projects touch several parts of the codebase)
[20:17] <mdiggory_> hmm what does it mean when your MacBook Pro makes a boing sound then completely freezes...
[20:18] <tdonohue> no idea -- ugh -- don't talk to me about MacBook pro's -- my battery is completely hosed :(
[20:18] <mdiggory_> ok, I was going to say that the REST work, if we are truly targeting 1.
[20:18] <mhwood> Time to go fishing in syslog.
[20:18] <mdiggory_> 1.7 with it, might have a home in the dspace 1.x ticketing
[20:19] <tdonohue> yea -- I was hoping that both REST and some of the very early Unit Testing could start to be pushed towards Trunk (and DSpace 1.7)
[20:19] <mdiggory_> (notes this will open up a discussion into async release process etc)
[20:20] <tdonohue> (but whether my hopes match with reality is another matter -- in any case, JIRA tickets should help us all get a better sense of how "ready" each of these projects are)
[20:21] <kshepherd> morning all, sorry for lateness
[20:21] <tdonohue> hi kshepherd -- still talking GSoC right now
[20:21] <mdiggory_> I think a goal with the REST work is that it support legacy DSpace at the same time as new services portings... IMO it should be pluggable to that degree and I know that Aaron would probably recommend such a target.
[20:22] <mdiggory_> I think there is a point where some destabilization of trunk needs to happen to get the work in... and we need to agree on that period as much as on the feature freeze dates etc
[20:23] <mdiggory_> so, for instance, if destabilization happens in sept then feature freeze could happen in oct
[20:23] <mdiggory_> or august/sept
[20:23] <tdonohue> that all makes sense -- I still think the best first step is to have someone (preferably the students, perhaps with mentor help) create the appropriate JIRA tickets for each project -- that way we can push them for review and try and come to a quick consensus on next steps
[20:24] <mdiggory_> I agree, but we need to agree on who does that commit work, and I will suggest students over mentors
[20:25] <tdonohue> mdiggory_ you mean actually having the students commit the code to trunk? (or am i misunderstanding)
[20:25] <mdiggory_> yes
[20:26] <mdiggory_> We talked about temporary commit rights etc at the or10 meeting...
[20:26] <tdonohue> hmmm... as much as I'd like to say "yes", that does make me slightly nervous (in that we haven't had a more thorough review of the code by committers to help decide which parts are 'non-controversial')
[20:26] <mdiggory_> my point in suggesting it is to alleviate yet another gatekeeper role / bottleneck in the process
[20:27] <tdonohue> yes, we did talk about the temporary commit rights at OR10 -- but if i recall, it was tabled for a full-meeting discussion (special topic mtg)
[20:27] <mhwood> Commit to a branch and then merge slightly later?
[20:27] <tdonohue> yea, i definitely see the benefits mdiggory -- i completely agree with minimizing bottleneck (just worried at whether this is controversial amongst our committers)
[20:28] <kshepherd> it makes me slightly nervous ;)
[20:28] <tdonohue> commit to a branch first may be less controversial
[20:28] <mdiggory_> I challenge that this community needs to change its perception of trunk.
[20:28] <kshepherd> on the other hand, i've argued that broken trunks aren't the end of the world
[20:28] <mdiggory_> but to do so respectfully; I know why the first instinct is to be conservative.
[20:28] <tdonohue> mdiggory_ i understand your challenge (and honestly, I agree to a point) -- it's just something that needs to be voted on
[20:29] <mdiggory_> of course
[20:29] <tdonohue> so -- if you want to propose this (likely via email, we don't have enough here now), we can bring to a vote and see what happens
[20:29] <kshepherd> if we really can't live with broken trunks, we'll just keep our own branches on our own svn repos anyway
[20:30] <kshepherd> so i think i'd vote for it.. might need to think more though ;)
[20:30] <mdiggory_> mhwood: unless there is a review between the commit to the branch and the merge... I don't see usefulness in that strategy.
[20:30] <tdonohue> I see three options: (1) allow commit to trunk , (2) allow commit to a branch, and later merge to trunk (after more review/testing), (3) wait to move to trunk until review/testing (the old way of doing things -- which sometimes leaves things languishing)
[20:30] <mhwood> The branch is to create space for review, with the code checked into the SCM so we can use all the usual SCM tools for inspecting and manipulating it.
[20:31] <mdiggory_> thats already happening
[20:31] <mdiggory_> but we need to be very cautious about how long those branches sit... deviating from trunk
[20:31] <tdonohue> mdiggory_ it's happening separately right now though (each project has its own 'branch') -- I think mhwood is suggesting a "merged" branch
[20:32] <mdiggory_> the longer they do so, the lower their probability for inclusion
[20:32] <mhwood> Agreed: *prompt* review and decisions.
[20:32] <tdonohue> agreed -- I think the goal would be that the branch is very temporary -- a place for immediate review and then hopefully move to trunk before 1.7
[20:32] <mdiggory_> tdonohue: I just think that's not such a good idea, this is really what a trunk is for
[20:33] <mhwood> So where should we be looking? The SCM tree seems to have grown like kudzu lately.
[20:33] <mdiggory_> and I agree with kshepherd, breakage on the trunk is actually a good thing
[20:33] <mdiggory_> it's the disturbance that promotes change/improvement
[20:33] <kshepherd> yeah we do need to get less precious about trunk, i think ;)
[20:33] <tdonohue> mdiggory_ yes -- but, do these students have experience with merging? I'm worried that a merge could accidentally break more than we bargained for (but I could just be paranoid)
[20:33] <kshepherd> and we have more code review tools to help with post-commit reviews, too
[20:34] <mdiggory_> tdonohue: that is why we are here as mentors, to assist them in learning such things if they do not
[20:34] <mdiggory_> ie mentoring = helping
[20:34] <mdiggory_> not mentoring = judging
[20:34] <grahamtriggs> depends on the way you want to view it - yes, we could get less precious and more experimental with trunk... but that would mean we should be branching early for releases
[20:35] <mhwood> Yes, be as experimental as we want, so long as there is an island of relative stability *somewhere*.
[20:35] <mdiggory_> grahamtriggs: or we work in phases/periods of stability /instability
[20:35] <mdiggory_> ie... controlled chaos
[20:36] <tdonohue> so, i'll also note here that "breaking trunk" currently is against our recently approved "Guidelines for Committing" -- https://wiki.duraspace.org/display/DSPACE/Guidelines+for+Committing (obviously, we can change these guidelines as we see fit, but we need to have larger discussion)
[20:36] <mdiggory_> So, table for special topic meeting that needs to happen asap?
[20:36] <mdiggory_> so more developers that have an interest in such a policy have time to respond
[20:37] <tdonohue> mdiggory_ : yes, we could even schedule mtg for next week -- but, we need to also send an email proposing these changes (for those unable to attend)
[20:38] <mdiggory_> I volunteer kshepherd ;-)
[20:38] <tdonohue> mdiggory_ : will you take lead on proposing this to dspace-devel today/tomorrow? We can put on agenda for next week's mtg.
[20:38] <kshepherd> hmm? for what?
[20:38] <tdonohue> haha -- you are the one who brought this all up, mdiggory_ :)
[20:39] * kshepherd runs svn delete on trunk and runs away cackling
[20:39] <tdonohue> haha :)
[20:39] <mdiggory_> to write an email about proposing to change the Guidelines to allow more breakage or destabilization phases on trunk
[20:39] <mdiggory_> ok ok ok...
[20:40] <tdonohue> yes -- and explain the background as to *why you propose this* -- which is that the proposal is that the GSoC students commit ready code to trunk
[20:40] <tdonohue> those are two separate issues, I know -- but we might be able to cover both in one special topic mtg
[20:40] <kshepherd> my "break trunk" opinions have mostly been in the context of the way we tend to leave bugfixes going stale in JIRA instead of just committing.. this is a slightly different context so i'd like to read/think more
[20:40] <mdiggory> hey mdiggory_ stop volunteering me for things...
[20:41] * tdonohue uh-oh, mdiggory and mdiggory_ are fighting -- this could get ugly :)
[20:42] <tdonohue> ok -- anything else immediately on GSoC? I notice we only have 20mins left (not that there's much more substantial on the agenda though)
[20:42] <mhwood> kshepherd, that is a bit different. I think many of us tend to want more comment than we give, so changes rot....
[20:42] * kshepherd nods
[20:43] <mdiggory_> but I think it is related
[20:43] <kshepherd> no better way to encourage peer review than accidentally breaking trunk, though ;)
[20:43] <mdiggory_> because it has to do with getting things into trunk, then hardening them when errors arise
[20:43] <kshepherd> so like mdiggory_ says, it can actually force us into action
[20:43] <tdonohue> true -- but we also don't want to harm productivity of our widely distributed group by having a broken trunk for too long :)
[20:43] <mdiggory_> rather than waiting long periods of time and only accepting the most perfect merges
[20:43] <kshepherd> and will get us used to using atlassian code review tools more
[20:44] <mdiggory_> which raises the bar for contribution
[20:44] <mhwood> Do those tools work now? Must go look again.
[20:44] <kshepherd> tdonohue: heh true, of course.. bear in mind that i think 95% of commits won't actually result in a broken trunk ;)
[20:44] <tdonohue> yea agreed :)
[20:44] <kshepherd> mhwood: heh, i hope so! i think there is a test review in crucible
[20:44] <mdiggory_> yes, crucible is still working
[20:45] <tdonohue> Ok -- agenda is a bit blown apart for today. Though, technically we decided on our next Special Topic Mtg already: https://wiki.duraspace.org/display/DSPACE/DevMtg+2010-07-21
[20:46] <mdiggory_> We can refocus, i've got a task to fire off an email on the topic
[20:46] <tdonohue> So -- the only other thing on there was any questions/comments on the discussions out of OR10 -- and for those who couldn't make it, if there are things that are not clear in the Dev Mtg notes, it'd be good to know (so we can clarify them)
[20:47] * kshepherd needs to catch up
[20:48] <tdonohue> so, mhwood & kshepherd (and anyone else who couldn't join us) -- please read through the notes when you get a chance -- let me know if there are questions / comments. We came away with several "ToDo's" which will require a full meeting to discuss (special topic mtg)
[20:48] <mhwood> Will do.
[20:49] <tdonohue> Also, at OR10, Thorny Staples of DuraSpace offered to give us all an "Intro to Fedora" webinar if there are any of us interested in attending (and we could record it for later as well). So, I'm interested to hear if this is of general interest -- comments welcome
[20:50] <keithgilbertson> +1 webinar
[20:50] <tdonohue> (and obviously this wouldn't be limited just to DSpace committers -- it'd be an open webinar for anyone in community)
[20:50] <kshepherd> yep will do
[20:50] <mdiggory_> Absolutely of interest
[20:50] <JoseBlanco> I would be interested too.
[20:51] <mhwood> Yes, I need to get somewhat up to speed on Fedora.
[20:51] <tdonohue> ok -- i figured there would be interest -- I'll bring this up again with Thorny and see if we can get something scheduled for everyone (and get it recorded too)
[20:51] <mdiggory_> We could tie together Committer Rights and Guidelines into one special topic meeting
[20:52] <mdiggory_> and Async / Dependencies into one ST meeting
[20:53] <tdonohue> mdiggory_ : makes sense
[20:54] <mdiggory_> Per Fedora Committer meetings... My timezone has historically made it impossible to attend them. I may get more time the next two months... but what about reports to the team by those who can attend more regularly?
[20:55] <tdonohue> mdiggory_ you still fine with bringing up the subject of the GSoC commits to trunk as an intro to that first discussion? Then we can get the Commit Rights / Guidelines discussed in terms of what you are proposing for GSoC -- this is a good example of why we may need to loosen some of these rules
[20:56] <mdiggory_> For instance, I stepped into the last meeting (albeit rather late) to see about a Special Topic around using Spring to inject modules / plugins... that might lead to further corroboration
[20:56] <kshepherd> i have a meeting now, gotta run, cheers all
[20:57] <mdiggory_> tdonohue: yes, perhaps we start with a wiki page to outline the meeting then use that as material for the email
[20:57] <tdonohue> mdiggory_ : ok, that'd make some sense -- you'll have time to help with this? we really need to get it in everyone's head in the next few days, so that everyone has time to think about it before next week
[20:58] <tdonohue> (I can also help to fill out the wiki page)
[20:59] <mdiggory_> If we collaborate... I can give a few cycles
[20:59] * bojans (~Bojmen@ has joined #duraspace
[21:00] <mdiggory_> ... ok, I think krnl_ and I are setting up to discuss dspace-storage api architecture issues in the next hour, so if there are those here who want to listen in on / participate in a GSoC meeting, it's an impromptu opportunity.
[21:00] <tdonohue> sounds good -- I want to make sure it captures what you are thinking in regards to GSoC -- cause that really is the thing that needs the most immediate analysis and discussion (as deadlines are looming, etc).
[21:00] <mdiggory_> true, true
[21:00] <tdonohue> ok -- sounds good -- the official Dev Mtg is closed -- take it away mdiggory (and everyone please stick around if you can to hear about GSoC)
[21:01] <mhwood> Wish I could stay.
[21:01] * mhwood (~mhwood@2001:18e8:3:171:218:8bff:fe2a:56a4) has left #duraspace
[21:02] * JoseBlanco (8dd32b9d@gateway/web/freenode/ip. Quit (Quit: Page closed)
[21:02] <mdiggory_> I guess its time to let mdiggory speak again...
[21:03] * mdiggory_ (~mdiggory@ip72-199-217-116.sd.sd.cox.net) Quit (Quit: mdiggory_)
[21:03] <mdiggory> whew, he's gone
[21:04] <mdiggory> If that totally confused you, my laptop had frozen and I was using a second machine until it got done "thinking really hard"
[21:05] <mdiggory> So I guess we can bring to order a meeting around GSoC for those who are here? I see bojans and krnl_ are in the house
[21:06] <mdiggory> krnl_: do you want to bring up the topic again concerning the storage api design in the forum now?
[21:07] <krnl_> ok
[21:07] <krnl_> these are really specific questions, but they need to be answered
[21:08] <mdiggory> (for those needing to catch up please see: https://wiki.duraspace.org/display/DSPACE/GSOC10+-+Backport+of+DSpace+2+Storage+Services+API+for+DSpace+1.x)
[21:09] <krnl_> i tried to incorporate changes i collected from Mark ideas and produces class/interface diagram
[21:09] <krnl_> produced
[21:11] <mdiggory> So, the current model leans towards treating Binary content separately from other properties.
[21:11] <mdiggory> with the IBinaryStorage mixin and the IMetadataStorage mixin
[21:12] <krnl_> it does produce whole similar branch
[21:12] <krnl_> i was thinking whether this is ok
[21:12] <mdiggory> Are there challenges with method signatures having different return types in each mixin?
[21:12] <mdiggory> IE restoreVersion
[21:14] <krnl_> this depends how binary is different from property
[21:14] <mdiggory> Higher up in IBinaryWritableStorage etc, you differentiated between the two mixins by using saveBinary vs saveMetadataProperties
[21:15] <mdiggory> I know it was incomplete in the manner I added it
[21:15] <mdiggory> The intention with Binary was that it would be a "Value" of a property
[21:15] <krnl_> ok
[21:15] <krnl_> that changes something
[21:16] <krnl_> this does allow merging these branches
[21:16] <mdiggory> but there still needs to be a service method like saveBinary that can process the InputStream and create the Binary object
[21:16] <mdiggory> and that whole approach is the manner that JCR 2 treats Binary objects...
[21:17] <mdiggory> there was a considerable "back and forth" between Aaron and I when we were trying to create the original api
[21:18] <mdiggory> I wanted (much like you've started to approach here) that Binary and Metadata were different "Representations" of an Entity
[21:19] <mdiggory> Thus there might be entirely separate MetadataStore and Binary Store that associated Entity ids with objects
[21:19] <mdiggory> the same entity id would return different representations given the service you called it on
[21:19] <mdiggory> IE like Bitstream metadata vs Bitstream content...
[21:20] <mdiggory> Aaron leaned more towards treating content as values attached to entities in the same service
[21:20] <mdiggory> IE the JCR way of doing things
[21:20] <mdiggory> there was a compromise...
[21:21] <mdiggory> we really need to see the content both ways
[21:21] <mdiggory> especially when you start trying to map to fedora
[21:22] <mdiggory> But it's important to recognize that services are applied differently at different levels of the application architecture....
[21:23] <mdiggory> Storage Level: Services are specific to the store... IE TupiloStore, FedoraStore, DataSource, etc
[21:24] <mdiggory> Application Level: Services are centric to Function... User, Event, Search, Metadata, Content
[21:25] <mdiggory> So we need to think about where each level maps onto a Legacy DSpace api
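The two positions described above (content as a property value on an entity, versus separate metadata and binary stores that return different representations for the same entity id) can be sketched roughly as follows. This is only an illustrative Python sketch, not the GSoC project's Java API: `Binary`, `save_binary`, `Entity`, `MetadataStore`, and `BinaryStore` are hypothetical names, and the SHA-256 checksum is an assumption chosen for the example.

```python
import hashlib
import io
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Binary:
    """Hypothetical value object wrapping stored bytes."""
    data: bytes
    checksum: str

    @property
    def size(self) -> int:
        return len(self.data)


def save_binary(stream) -> Binary:
    """Service method: consume an input stream and build a Binary value."""
    data = stream.read()
    return Binary(data=data, checksum=hashlib.sha256(data).hexdigest())


# Approach A: content is just another property value on the entity.
@dataclass
class Entity:
    entity_id: str
    properties: dict = field(default_factory=dict)


# Approach B: separate stores return different representations of one id.
class MetadataStore:
    def __init__(self):
        self._rows = {}

    def save(self, entity_id, properties):
        self._rows[entity_id] = dict(properties)

    def get(self, entity_id):
        return self._rows[entity_id]


class BinaryStore:
    def __init__(self):
        self._blobs = {}

    def save(self, entity_id, stream):
        self._blobs[entity_id] = save_binary(stream)

    def get(self, entity_id):
        return self._blobs[entity_id]
```

In approach A a caller would set `entity.properties["content"] = save_binary(stream)`; in approach B the same id is passed to whichever store holds the representation that is wanted.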
[21:26] <mdiggory> Maybe we can get some perspective from bojans...
[21:27] <mdiggory> How do we deal with REST api on Bitstreams at the moment?
[21:27] <bojans> they are approached through the currently available api
[21:27] <bojans> eg dspace-api, content and so
[21:28] <bojans> but I am working to decouple these functions and concentrate them
[21:28] <mdiggory> but how do you represent access to either content vs bitstream metadata?
[21:28] <bojans> so as to make the transition to the new system easier, e.g. more transparent
[21:29] <bojans> you mean how it is accessed, from the user perspective?
[21:30] <mdiggory> ok, from the standpoint of "representations"....
[21:30] <mdiggory> A Bitstream does have metadata, (checksum, size, filename, description)
[21:30] <mdiggory> and a Bitstream is also content
[21:31] <mdiggory> likewise a Bitstream has relations to BitstreamFormat and the Bundle it is in
[21:31] <mdiggory> I found it...
[21:31] <mdiggory> ..../bitstream/{id}/receive
[21:31] <mdiggory> vs
[21:32] <bojans> yes this is representation
[21:32] <mdiggory> ..../bitstream/{id}
[21:32] <mdiggory> see
[21:32] <mdiggory> https://wiki.duraspace.org/display/DSPACE/GSOC10+-+DSpace+REST+API
[21:32] <bojans> no currently it should be /bitstream/{id}.. at least to
[21:32] <bojans> receive is deprecated there
[21:33] <bojans> the same approach as for other entities applies
[21:33] <bojans> e.g. Bitstream has metadata which can be accessed using /bitstream/{id}/[checksum,bundles,size...]
[21:34] <bojans> or everything can be received in one package through /bitstream/{id}
[21:34] <mdiggory> so... /bitstream/{id} returns the content and... we access everything else individually?
[21:34] <bojans> the similar applies for other entities
[21:34] <bojans> actually, this part I haven't definitely decided, there are two variants
[21:34] <bojans> 1) bitstream/{id} returns content and other fields
[21:35] <bojans> or 2) it returns only fields and content must be downloaded separately as bitstream/{id}/content
[21:35] <bojans> I am trying to be consistent with this among other access points
[21:35] <mdiggory> that is confusing... makes me think we need something "Linked Data" like here...
[21:36] <mdiggory> so in the linked data approach the following would be the case...
[21:37] <mdiggory> .../bitstream/{id} might return the content, while .../bitstream/{id}/representation.ext might return metadata
[21:37] <mdiggory> something like bitstream/{id}/content for the content is an alternative
[21:39] <bojans> from the perspective of the bitstream, at least as it is currently defined, I have understood/perceived content as a "equal" field among other fields in the bitstream entity
[21:39] <bojans> so from this point it goes like /bitstream/{id}/content
[21:39] <bojans> however from the user point content might be field of greater importance etc
[21:39] <mdiggory> Thats what I was looking for as an answer
[21:40] <mdiggory> the duality exists in the REST api as well.
[21:41] <mdiggory> The question is, at some point does a decision need to be made on the choice of approach.
[21:42] <mdiggory> Content as Entity vs Content as property of Entity
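The two REST variants bojans lists can be put side by side in a small resolver. This is a hedged Python sketch and not the GSoC REST module's code: `get_bitstream`, the `content_inline` flag, and the sample record are hypothetical, and only the URL shapes (`/bitstream/{id}`, `/bitstream/{id}/{field}`, `/bitstream/{id}/content`) come from the discussion.

```python
# Hypothetical sample data standing in for a stored bitstream.
BITSTREAMS = {
    "42": {
        "fields": {"name": "thesis.pdf", "size": 5, "checksum": "abc"},
        "content": b"%PDF-",
    }
}


def get_bitstream(path, content_inline):
    """Resolve a /bitstream/... path under one of the two variants.

    content_inline=True  -> variant 1: /bitstream/{id} returns fields and content
    content_inline=False -> variant 2: /bitstream/{id} returns fields only
    In both variants /bitstream/{id}/{field} returns a single field and
    /bitstream/{id}/content returns the raw bytes.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) not in (2, 3) or parts[0] != "bitstream":
        raise ValueError("unsupported path: " + path)
    record = BITSTREAMS.get(parts[1])
    if record is None:
        raise KeyError(parts[1])
    if len(parts) == 2:
        body = dict(record["fields"])
        if content_inline:
            body["content"] = record["content"]
        return body
    if parts[2] == "content":
        return record["content"]
    return record["fields"][parts[2]]
```

Under variant 2 a client always makes a second request for the bytes, which keeps the entity response small when the content is large; under variant 1 a single request carries everything.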
[21:43] <mdiggory> This is where the object model evolves to....
[21:43] <bojans> I think we may look there also from the perspective of the technical implementation and possible issues
[21:44] <bojans> e.g. content, which is in 100s or 10Ks in size compared to name, format description or some other property
[21:44] <bojans> and there is also point of its storage implementation
[21:44] <bojans> how it is stored
[21:45] <mdiggory> Yes, we may store metadata locally for fast access while storing content somewhere else (i.e. Cloud)
[21:46] <mdiggory> Legacy DSpace has this with database and assetstore
[21:46] <mdiggory> or database and SRB/iRODS instance
[21:46] <mdiggory> and we regularly do customizations that push the content into third-party stores...
[21:47] <bojans> and this fact maybe pushes a bit towards content as an entity
[21:47] <mdiggory> then we have tdonohue's work to produce AIPs, which basically take that metadata representation and serialize it into the storage tier
[21:47] <mdiggory> turning it into content
[21:47] <mdiggory> or at least that is what the original AIP prototype was trying to do
[21:48] <tdonohue> mdiggory -- the new AIP work doesn't do that -- it's for *export* only -- so the new AIP work is just to export content/metadata out of DSpace
[21:49] <tdonohue> and then reimport it back in later on -- so it's for backup/restore functionality essentially.
[21:49] <mdiggory> yea, that is why I qualified the answer further ;-)
[21:49] <tdonohue> (i.e. it doesn't touch storage at all)
[21:49] <mdiggory> sorry if that creates misdirection
[21:50] <mdiggory> So, maybe some thought about this from the Fedora side? In Fedora we have Fedora Objects with properties and datastreams (which also have properties)... what does the REST API for accessing datastreams look like?
[21:51] <mdiggory> https://wiki.duraspace.org/display/FCR30/REST+API#RESTAPI-getDatastreamDissemination
[21:51] <tdonohue> FYI -- Fedora REST API docs: https://wiki.duraspace.org/display/FCR30/REST+API
[21:52] <mdiggory> theres two methods...
[21:52] <mdiggory> https://wiki.duraspace.org/display/FCR30/REST+API#RESTAPI-getDatastream
[21:52] <mdiggory> API-A and API-M
[21:54] <mdiggory> I don't see a means to get just the datastream properties though
[21:54] <mdiggory> unless thats just a dissemination
[21:55] <tdonohue> I think it *is* just a dissemination (though I could be mistaken)
[21:56] <mdiggory> ok
[21:57] <mdiggory> so, returning to bojans' comment earlier
[21:57] <mdiggory> bojans: e.g. content, which is in 100s or 10Ks in size compared to name, format description or some other property
[21:57] <mdiggory> [2:44pm] bojans: and there is also point of its storage implementation
[21:57] <bojans> yes
[21:58] <mdiggory> Something else that came up at OR10...
[21:59] <mdiggory> was that what we are trying to do with assembling several stores under the ProvidedStorageService is similar to a project in the Fedora community to introduce a higher-level storage solution allowing multiple stores to be organized with separate rules for how content is processed when added
[22:00] <mdiggory> trying to recall if it was based on Akubra or not... but... there was discussion on not replicating the wheel in that regard
[22:01] <mdiggory> So we may or may not have need for complex storage assemblies in DSpace
[22:02] <mdiggory> But, from an API standpoint, it may be important to focus on the "function of the API" needed in DSpace.
[22:03] <mdiggory> Is it the case we should focus on entirely separate services or continue the mixin approach?
[22:04] <mdiggory> I don't want to dominate the discussion here...
[22:07] <mdiggory> ok, it's probably time to close down the meeting if we are not making further progress
[22:08] <bojans> honestly, I am not sure I am catching all the thoughts here
[22:08] <krnl_> the problem is that nobody is sure how everything really should be done
[22:08] <mdiggory> the point in question was in reference to your technical implementation view
[22:09] <mdiggory> are content and metadata storage sufficiently separate functions that they should have entirely separate services rather than just mixin interfaces?
[22:10] <bojans> from the technical e.g. implementation view it seems they are
[22:10] <mdiggory> and thus significantly different representations/providers in the rest interface
[22:11] <bojans> but anyway, for the rest interface, does that automatically mean they should be represented completely separate or independent of bitstream/item?
[22:11] <mdiggory> krnl_: certainly an area of many questions, yes
[22:12] <mdiggory> bojans: probably not
[22:12] <bojans> also for the content (and probably metadata) separate service might give other possibilities to.. "evolve" in the future
[22:12] <mdiggory> we see that both JCR and Fedora keep the two close together
[22:12] <bojans> as you mentioned before, clouds and others
[22:13] <mdiggory> so what are the "functional areas" of storage? Metadata and Content?
[22:14] <mdiggory> at OR, we also discussed trying to expose Fedora Content Models up into DSpace...
[22:14] <mdiggory> that suggests another service... a SchemaService
[22:15] <mdiggory> thus there's... Content, Metadata and Schema, where Schema contains models for how the Content and Metadata relate to each other
[22:15] <mdiggory> kind of a MetadataSchema that is "type" specific
[22:16] <bojans> ok
[22:16] <mdiggory> we will probably want to implement those quite differently in cases of Tupelo, JCR, Fedora and Legacy DSpace
[22:17] <tdonohue> i unfortunately have to go (and I'm only 1/2 paying attention anyways) -- but, it seems to me that it might be worth learning from JCR & Fedora (and doing something similar, if possible), especially since we are investigating more integration between Fedora and DSpace. Not sure if that helps or hinders these decisions...
[22:18] <mdiggory> So, on mixins: for krnl_, I think the take-home is that while we have mixin interfaces that can be assembled for Metadata vs Binary content, we do ultimately need different "Services" that the application goes to for each
[22:19] <mdiggory> but this can be as simple as creating spring aliases to each in the spring configuration
[22:19] <mdiggory> all pointing back to one common singleton
[22:20] * tdonohue (~tdonohue@c-98-228-50-55.hsd1.il.comcast.net) has left #duraspace
[22:20] <mdiggory> or if it's easier... keep them separate
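The "aliases pointing back to one common singleton" idea can be illustrated without Spring. The sketch below is a Python stand-in under stated assumptions: the `SERVICES` dictionary plays the role of the Spring configuration, and `LegacyStore`, the mixin class names, and the service names `metadataService` and `contentService` are all hypothetical.

```python
class MetadataStorageMixin:
    """Metadata half of the store (hypothetical interface)."""

    def save_metadata(self, entity_id, properties):
        self._metadata[entity_id] = dict(properties)

    def load_metadata(self, entity_id):
        return self._metadata[entity_id]


class BinaryStorageMixin:
    """Binary-content half of the store (hypothetical interface)."""

    def save_binary(self, entity_id, data):
        self._binaries[entity_id] = bytes(data)

    def load_binary(self, entity_id):
        return self._binaries[entity_id]


class LegacyStore(MetadataStorageMixin, BinaryStorageMixin):
    """One implementation assembled from both mixins."""

    def __init__(self):
        self._metadata = {}
        self._binaries = {}


# A single shared instance registered under two service names,
# standing in for two aliases of one singleton bean.
_store = LegacyStore()
SERVICES = {"metadataService": _store, "contentService": _store}


def get_service(name):
    """Look up a service by the name the application asks for."""
    return SERVICES[name]
```

Application code asks for the service it needs by name and never learns that both names resolve to the same object; splitting them into separate implementations later would only change the registry.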
[22:20] <mdiggory> I think we lost krnl_ it is 20 minutes after midnight there isn't it?
[22:21] <krnl_> 1:20 to be more exact :)
[22:21] <mdiggory> ok... well, let's allow you guys some rest... I'm sure he will follow up later.
[22:59] * bojans (~Bojmen@ Quit (Remote host closed the connection)

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.