#duraspace IRC Log


IRC Log for 2013-05-15

Timestamps are in GMT/BST.

[6:38] -wolfe.freenode.net- *** Looking up your hostname...
[6:38] -wolfe.freenode.net- *** Checking Ident
[6:38] -wolfe.freenode.net- *** Found your hostname
[6:38] -wolfe.freenode.net- *** No Ident response
[6:38] * DuraLogBot (~PircBot@atlas.duraspace.org) has joined #duraspace
[6:38] * Topic is '[Welcome to DuraSpace - This channel is logged - http://irclogs.duraspace.org/]'
[6:38] * Set by cwilper!ad579d86@gateway/web/freenode/ip. on Fri Oct 22 01:19:41 UTC 2010
[9:25] * fasseg (~fas@HSI-KBW-078-043-007-220.hsi4.kabel-badenwuerttemberg.de) has joined #duraspace
[10:30] * fasseg (~fas@HSI-KBW-078-043-007-220.hsi4.kabel-badenwuerttemberg.de) Quit (Read error: No route to host)
[10:31] * fasseg (~fas@HSI-KBW-078-043-007-220.hsi4.kabel-badenwuerttemberg.de) has joined #duraspace
[12:22] * mhwood (mwood@mhw.ulib.iupui.edu) has joined #duraspace
[13:07] * tdonohue (~tdonohue@c-67-177-111-99.hsd1.il.comcast.net) has joined #duraspace
[13:36] * ksclarke (~kevin@pdpc/supporter/active/ksclarke) has joined #duraspace
[14:05] * PeterDietz (~peterdiet@ has joined #duraspace
[14:57] * hpottinger (~hpottinge@mu-162198.dhcp.missouri.edu) has joined #duraspace
[16:59] * PeterDie_ (~peterdiet@dhcp-140-254-148-230.osuwireless.ohio-state.edu) has joined #duraspace
[17:03] * PeterDietz (~peterdiet@ Quit (Ping timeout: 248 seconds)
[18:19] * PeterDie_ (~peterdiet@dhcp-140-254-148-230.osuwireless.ohio-state.edu) Quit (Remote host closed the connection)
[18:20] * PeterDietz (~peterdiet@ has joined #duraspace
[18:53] * bram-atmire (~bram@94-225-35-170.access.telenet.be) has joined #duraspace
[18:53] <bram-atmire> hi
[19:33] * fasseg (~fas@HSI-KBW-078-043-007-220.hsi4.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[19:41] * helix84 (~a@ip4-95-82-147-170.cust.nbox.cz) has joined #duraspace
[19:58] * bollini (~chatzilla@pD9E846B1.dip0.t-ipconnect.de) has joined #duraspace
[19:59] * robint (522a6b02@gateway/web/freenode/ip. has joined #duraspace
[20:01] <tdonohue> Hi all, it's time for our weekly DSpace Developers Mtg. Agenda up at: https://wiki.duraspace.org/display/DSPACE/DevMtg+2013-05-15
[20:01] <kompewter> [ DevMtg 2013-05-15 - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2013-05-15
[20:02] <tdonohue> As we've done recently, we'll start things off today with a review of a few older Pull Requests
[20:02] <tdonohue> Here's our Pulls, oldest first: https://github.com/DSpace/DSpace/pulls?direction=asc&page=1&sort=created&state=open
[20:02] <kompewter> [ Pull Requests · DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace/pulls?direction=asc&page=1&sort=created&state=open
[20:02] <tdonohue> today we'll start at PR #154 : https://github.com/DSpace/DSpace/pull/154
[20:02] <kompewter> [ Improved thumbnail and branded preview filters as discussed in DS-1259 by jsnshrmn · Pull Request #154 · DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace/pull/154
[20:03] <tdonohue> related to DS-1259
[20:03] <kompewter> [ https://jira.duraspace.org/browse/DS-1259 ] - [#DS-1259] use better image downscaling method in filter-media - DuraSpace JIRA
[20:03] <helix84> i have that assigned, just didn't test it yet
[20:04] <tdonohue> ok, cool. Looks like Joao already said he likes it too
[20:04] <bram-atmire> looks nice
[20:04] <bram-atmire> i mean the idea, didn't look at the code
[20:04] <helix84> yes, he didn't test it, either :)
[20:04] <helix84> we all like the idea, some of us looked at the code, but only the author ran it :)
[20:05] <tdonohue> sounds like a good idea...and the proposed new library is a compatible license. So, if it gets some testing, it sounds like it's good to go.
[20:05] <mhwood> Yes
[20:06] <helix84> if anyone of you thinks you can look at this feel free to grab it because it's not currently a priority for me, although i will get to it sooner or later
[20:07] <tdonohue> Added a comment to the PR, that we generally approve as long as it gets testing. Gonna move on to the next one for now
[20:07] <tdonohue> Next #159: https://github.com/DSpace/DSpace/pull/159
[20:07] <kompewter> [ DS-1433 add site-wide facets for /community-list by helix84 · Pull Request #159 · DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace/pull/159
[20:08] <tdonohue> related to DS-1433
[20:08] <kompewter> [ https://jira.duraspace.org/browse/DS-1433 ] - [#DS-1433] Communities & Collections page in Discovery should show facets - DuraSpace JIRA
[20:09] <helix84> this could use some feedback from you
[20:09] <helix84> do you think it's a good idea to show the same facets on community-list as on the front page?
[20:09] <bollini> mmm... not sure that facets in community-list is clear for the users
[20:09] <bollini> what happens when clicking? list of items or communities&collection with matching items?
[20:10] <tdonohue> So, you want the site-wide facets to also be here? http://demo.dspace.org/xmlui/community-list
[20:10] <kompewter> [ Community List ] - http://demo.dspace.org/xmlui/community-list
[20:10] <helix84> same as front page
[20:10] <helix84> tdonohue: right
[20:10] <helix84> it's probably the only content page that doesn't have it
[20:11] <bollini> imho it is confusing. In home page they work as showcase of top "terms" etc. but when you are in the community-list page you already have started the navigation
[20:11] <tdonohue> just noting that none of the other "Browse-by" pages have the Site-wide facets either (Browse by Title, Author, Subject). In a way, "community-list" is just "Browse by Community"
[20:12] <helix84> tdonohue: that would be because they're still rendered by the browse aspect, not the discovery aspect. i expected that to change, too.
[20:12] <helix84> tdonohue: browse by community sounds confusing - all the other browse by are metadata fields
[20:13] <tdonohue> Ok. yea, as long as we're working towards making things consistent. I don't mind the site-wide facets being on "community-list". Just don't have a strong opinion either way
[20:13] <helix84> any other opinions?
[20:13] <bram-atmire> no strong opinion here
[20:13] <bollini> -1 :-)
[20:13] <mhwood> No, I think it's just the two of you. :-)
[20:14] <helix84> now that's a vote and that is final
[20:14] <bollini> I'm trying to convince you ;-)
[20:14] <bram-atmire> slightly off topic … community listing for Zenodo
[20:14] <bram-atmire> https://zenodo.org/communities/
[20:14] <kompewter> [ Community Collections | ZENODO ] - https://zenodo.org/communities/
[20:14] <bram-atmire> (no facets ;) )
[20:14] <bram-atmire> but then again wouldn't really scale if there are hundreds of collections
[20:14] <bollini> in browse and community-list I would like facets but I think that facets should work on the shown content
[20:15] <bollini> so facets on community-list should refine the community/collection list
[20:15] <helix84> bollini: frontpage has communities, too
[20:16] <bollini> and facet on a first level browse by author should limit the author names
[20:16] <helix84> bollini: granted, it has recent contributions, too, if you consider those items...
[20:16] <helix84> bollini: but the facets on front page filter the whole site, not just what's on front page (that wouldn't even make sense)
[20:17] <bollini> home page is an exception because it is a show case of all the repository content
[20:17] <bollini> we also put recent submission in the home page
[20:17] * aschweer (~schweer@schweer.its.waikato.ac.nz) has joined #duraspace
[20:18] <helix84> bollini: so what would browse by pages look like after they're migrated to discovery?
[20:18] <helix84> i'm only asking because it was mentioned, not because i think they're the same case as community-list
[20:18] <helix84> trying to get a complete picture
[20:19] <bollini> I don't understand what you mean by browse by pages migrated to discovery
[20:19] <tdonohue> The other place I see facets is on individual Community/Collection pages (e.g. http://demo.dspace.org/xmlui/handle/10673/1 ). In that scenario, those facets let you filter the items within the specific community or collection.
[20:19] <bollini> are you talking about SOLR browse?
[20:19] <kompewter> [ Sample Community ] - http://demo.dspace.org/xmlui/handle/10673/1
[20:20] <helix84> browse by pages in xmlui are rendered by the lucene-based browse aspect, but they'll surely be migrated to discovery (solr)
[20:20] <tdonohue> I only say that cause, that does seem to imply helix84's suggestion that facets (should they appear on 'community-list') would filter the items site-wide (as you are not yet down to a specific Community or Collection)
[20:20] <helix84> tdonohue: correct
[20:21] <helix84> nowhere else we filter communities/collections, always items
[20:21] <tdonohue> in any case, we've probably used too much time on this one (seemingly minor) change. Perhaps we need to bring discussion into the JIRA ticket (DS-1433)?
[20:21] <bollini> ok
[20:22] <helix84> i'm fine with any way of getting comments, it just seems i got only one :)
[20:22] <tdonohue> If nothing gets decided around this ticket soon(ish), then I'd recommend we revisit later.
[20:23] <helix84> we can move on
[20:23] <tdonohue> helix84: my comment was that I don't feel too strongly, but that your point seems to be valid based on how facets work on individual Community/Collection pages. So, that's a second comment :)
[20:23] <tdonohue> in any case, moving along
[20:24] <tdonohue> Next topic. I left a placeholder here for any discussion/comments/questions around the output from the initial DSpace 3-5year Vision Meeting that took place last week. Committers should have already seen the links. Will paste them here shortly
[20:24] <tdonohue> Meeting notes: https://wiki.duraspace.org/display/DSPACE/DSpace+2013+Vision+and+Roadmap+Meeting
[20:24] <kompewter> [ DSpace 2013 Vision and Roadmap Meeting - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpace+2013+Vision+and+Roadmap+Meeting
[20:25] <tdonohue> (extremely) rough draft vision: https://wiki.duraspace.org/display/DSPACE/DSpace+2013+Vision+Document
[20:25] <kompewter> [ DSpace 2013 Vision Document - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpace+2013+Vision+Document
[20:25] <tdonohue> Everyone is welcome to read through and provide feedback (either publicly or you are welcome to send it privately to me, if you'd rather)
[20:26] <PeterDietz> hi LL
[20:26] <PeterDietz> HI ALL
[20:27] <tdonohue> As I mentioned, next steps are for this group to work towards a better "draft" vision. We'll need to start analyzing it from a technical point of view. There will also be tons of time for discussion at OR13 (both in the Developers Meeting before OR13, and during the DSpace User Group on Thurs, there's a discussion session)
[20:27] <tdonohue> I'm gonna pause here now. Any thoughts/questions/comments/concerns that anyone would like to share here?
[20:28] <mhwood> Roadmap, or generally?
[20:28] <helix84> i like mark's suggestion about iterating on the vision with feedback from both developers and "visionaries"
[20:28] <tdonohue> both, actually :)
[20:29] <helix84> after the brainstorming stage, of course
[20:29] <bram-atmire> I'm still not convinced whether this will end up to be an "evolutionary" or "revolutionary" vision
[20:30] <bram-atmire> either could work
[20:30] <hpottinger> Tim, I wonder if you could comment about where the discussion seemed to "stick" as the goal of this meeting was to draft a vision statement, and that did not happen... it might be helpful to know where the "brainstorming" seems to be.
[20:30] <tdonohue> +1 helix84 & mhwood. I agree, we will need to iterate on the vision with that feedback. I'm just trying to better *nail down* what the vision is. The current rough draft is essentially just a dump of the picture that *I* came away with. It needs review by attendees and a little polishing likely
[20:30] <aschweer> Looks like you did a lot of work at the use case level at that meeting -- I think that's great. As Mark said elsewhere, it can be way too tempting for makers to jump in at the "how do we do" level rather than "what do we do" or "why do we do"
[20:31] <hpottinger> +1 use cases :-)
[20:31] <helix84> i also really like bram's document which concisely shows the use cases for DSpace
[20:32] <bram-atmire> https://wiki.duraspace.org/display/DSPACE/DSpace+Positioning
[20:32] <kompewter> [ DSpace Positioning - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpace+Positioning
[20:32] <aschweer> I only had a very quick look at Bram's document but I agree, I like the idea of checking where DSpace fits in compared to other related(ish) systems
[20:32] <bram-atmire> I was also wondering if there are any institutions who currently have DSpace on the list of software to "phase-out" in the next years & why & what would replace it
[20:32] <tdonohue> hpottinger: in my opinion, the reason we didn't get to the actual drafting of a vision during the meeting was that we got a little "stuck" on the "how do we do" at times (and had to take a step back into use cases). We also had some disagreements around what seem to be the "core" DSpace use cases (and whether DSpace should be simply an IR or something "much bigger")
[20:33] <bollini> I'm still reading the meeting notes... just taking the opportunity to share some "recent" news (march 2013). I have formally joined the CERIF TG from euroCRIS. So I'm definitely interested in the CRIS integration use case
[20:33] <bram-atmire> reversely, it would also be great to know what other systems are on these "phase-out" lists and what it would take for DSpace to fill some of the void created by other phased-out applications
[20:33] <aschweer> good point Bram
[20:33] <helix84> tdonohue: IMHO that's what the vision should primarily be - "whether DSpace should be simply an IR or something "much bigger""
[20:33] <bram-atmire> congrats bollini
[20:33] <tdonohue> hpottinger: so, the main reason was we spent a bit of time "getting on the same page". And, we essentially ran out of time to actually "draft" anything.
[20:34] <helix84> tdonohue: and that's why use cases are so important
[20:34] <aschweer> I'd rather have the use cases as the first outcome than a fully drafted vision to be honest
[20:35] <tdonohue> bram-atmire: I've heard from several institutions (who I don't feel comfortable naming publicly) who have said that they would "phase out" DSpace if the software didn't modernize soon(ish). Some are rather large institutions who say they only stick with DSpace for the community & the fact that they have "sunk costs" & staff built-up expertise there.
[20:36] <mhwood> "Modernize" is a bit fuzzy....
[20:36] <helix84> i can understand that but they could have put it in a more constructive way
[20:37] <aschweer> that's why I like use cases better -- they force people to talk more about what they actually mean. it's easy to say "hey let's do DSpace on Hydra" but essentially that's meaningless unless you say for what purpose
[20:37] <tdonohue> mhwood: I actually meant it to be fuzzy :) I can get more specific...but much of it is stuff we probably all know about DSpace (and much was even listed as "pain points" in the Vision Mtg notes)
[20:37] <bram-atmire> sunk costs & staff built-up expertise = costs of migration that you see everywhere
[20:38] <bram-atmire> if they are in such a position they might migrate if something comes along with many new features or is substantially different
[20:38] <bram-atmire> but I agree = standing still is the same as losing ground, especially in IT
[20:38] <tdonohue> helix84: I'm summarizing. I apologize. Some of the feedback has been pretty specific... even the attendees at the Chicago Meeting gave some pretty specific feedback...which is why some of the meeting notes is anonymized.
[20:40] <tdonohue> But, I'm glad to hear all the agreement on concentrating on "use cases". I agree, that's what I was glad we got semi-drafted already. Obviously a lot more work to do with that (and a lot more feedback to get)
[20:41] <helix84> tdonohue: based on your presence in the vision meeting, is your feeling that dspace should concentrate on being an IR, being core services for different use cases or actually diversify for different use cases?
[20:41] <helix84> (that's what i'd call a stepping stone for the vision)
[20:42] <tdonohue> My personal opinion (and what seemed to be the general consensus): DSpace needs to concentrate on being an IR specifically. However, it should be a modern IR (and not the 10-year old concept of an "IR" that it currently is)
[20:42] <tdonohue> There were major concerns expressed about DSpace trying to "do too much" and therefore not excelling in any *one* thing
[20:42] <mhwood> Is there something to read about what makes an IR "modern"?
[20:42] <aschweer> so I guess the next steps are to refine the use cases more so that we can capture a good representation of "a modern IR"?
[20:42] <helix84> ok, let's put it as a conclusion somewhere
[20:42] <bram-atmire> if things are forking off, it's difficult to get the innovations and good stuff back into DSpace. So you need a group of people working at the "core" that remain interested in this core stuff, rather than working on the forks
[20:43] <PeterDietz> So what about integration with other systems? i.e. Coca-Cola might put their media assets into a DAM (Digital Asset Management system). What if DSpace could add a plugin to enable it to speak DAM?
[20:43] <bram-atmire> bollini's CRIS dspace, Dryad https://github.com/datadryad/dryad-repo are already examples of good ideas & code for which the path is not so clear to get contributions back into DSpace
[20:43] <tdonohue> mhwood -- much of the use cases describe a "modern IR" in the opinion of the attendees. We're missing things like Versioning, UI for Configs, Support for Streaming, an easier to work with UI, etc etc
[20:43] <kompewter> [ datadryad/dryad-repo · GitHub ] - https://github.com/datadryad/dryad-repo
[20:43] <PeterDietz> same goes for CRIS, SWORD, OAI, ...
[20:44] <mhwood> I would agree that DSpace has become large and complex enough that keeping things out is as important as putting things in. That is, keeping them out of the core product and making it easy to leverage other software that focuses on those aspects.
[20:45] <tdonohue> Some of the most painful (to hear) feedback was that: "It's really hard to get stuff into & out of DSpace. Especially big stuff...video, audio, data files. Once I get stuff into DSpace, it's even harder to get it out" (I'm paraphrasing, but someone actually told me that in Chicago)
[20:45] <hpottinger> As someone who is in the middle of implementing a system built of parts, I can affirm that such an approach leads to all sorts of "interesting problems" resulting from lack of comprehensive knowledge of how all the parts interact
[20:45] <helix84> mhwood: so perhaps we need dspace to be more modular, "buildable upon"?
[20:45] <mhwood> Big stuff is inherently difficult. That doesn't mean we can't make it better, though.
[20:46] <tdonohue> mhwood: they weren't only talking about big stuff though...they meant also in general. Big stuff just shows the problem much easier
[20:46] <mhwood> One problem with big stuff is bending HTTP to cram it in.
[20:46] <helix84> tdonohue: that's one of the things I didn't understand from the summary - how is it hard to get stuff in and out? we have a handful of formats we can pour in and spit out. how can we make it easier?
[20:47] <bollini> have they given examples of other systems that are better than dspace in specific areas?
[20:47] <bollini> we can use them as reference to make progress
[20:47] <robint> I'm not clear about what are the differences between an IR and a DAM app that are significant enough that DSpace couldn't do both
[20:47] <tdonohue> helix84: The problem is that most of the tools to get stuff in/out require a sysadmin to run them (non-UI based)
[20:48] <mhwood> OK, we need to refine this broad notion of "hard to get in and out."
[20:48] <hpottinger> I actually asked around for help with "getting big stuff in to repositories is hard"
[20:48] <helix84> tdonohue: thanks, that clarifies a lot
[20:48] <PeterDietz> So, one pain point we have is that to change the theme of a Collection, involves metadata librarian, submitting a "service request", I then act upon that within 7 days, I edit xmlui.xconf, reboot tomcat, and that collection has a new theme. So the distilled use case would be: Repo-Admin to have self-service?
[20:48] <tdonohue> I think the easiest way to "boil down" what I heard is this...
[20:48] <mhwood> Moving configuration stuff into the UI may address that, PeterDietz.
[20:49] <aschweer> I thought we were talking "what" / "why" right now, not "how" ;)
[20:49] <mhwood> Depends, though: "change a theme" meaning use this one rather than that one, or meaning make this one different in some way.
[20:49] <tdonohue> I think most folks want "WordPress, for Repositories". They want to be able to set it up easily. Get going... pick & choose plugins (for more functionality) & themes.... And they want to be able to do all this, plus have normal "IR" functionality, and do it all (or mostly all) from the UI
[20:50] <tdonohue> (I'm not saying that's *easy*...just that gives a good conceptual model)
[20:51] <helix84> i'd say the strong points of wordpress are: 1) holds your hand during installation 2) a lot of made components to pick from 3) huge community
[20:51] <mhwood> PKP products like Open Journal System would be good to look at. They've made plugins fairly easy to use.
[20:51] <tdonohue> In my opinion, that implies many things mentioned above -- keeping DSpace simple (mostly IR use-case), making it modular (plugins), and giving it some polish (better UI, more administrative/config tools in the UI)
[20:51] <robint> mhwood: that's a real good example
[20:51] <helix84> i think dspace shares these strong points in the IR area compared to our competitors, but there is still a lot of room to improve in them
[20:53] <tdonohue> we definitely have many advantages in the "IR-area". We have a massive community (at least massive compared to others). We have a ton of potential resources to draw from...we're a "proven" product and known to be the "out-of-the-box" solution
[20:53] <bollini> IMHO our rigid datamodel is the key enemy of plugins that can easily add functionality
[20:54] <helix84> oh, and one strong point wordpress has and we don't is: 4) doesn't require you to bug the repo admin for most stuff
[20:54] <tdonohue> most of what I heard in chicago is that...really, what we need to do is just make sure DSpace is ready to last another 5-10 years (and meet current use cases). Currently in some ways, it's still like it's meeting use cases from 10 years ago
[20:54] <mhwood> Again, what does that mean, specifically?
[20:55] <tdonohue> mhwood: IR from 10 years ago was not so well defined. Often was very "grey literature" & textual publication specific. These days, we have to deal with data, video, audio, complex objects. We also really need *real* versioning
[20:56] <hpottinger> helix84: it's an asset as well as a weakness, though: requiring a programmer to implement customizations is kind of the "siren call" of DSpace: you must pony up resources in order to participate.
[20:56] <tdonohue> mhwood: so, essentially DSpace is showing its age. It still does great for mostly textual stuff. Once you try and put media into it...it's not the greatest or most "modern" tool around.
[20:56] <mhwood> Second approximation: "deal with data, video..." what's that?
[20:56] <tdonohue> mhwood: plus its metadata standards are outdated (but we knew that)
[20:56] <PeterDietz> It gives you CRUD for all content types.
[20:57] <PeterDietz> I don't have problems with images / video: https://github.com/osulibraries/DSpace/wiki/XMLUI-Customizations-to-Themes#snazy--kitchen-sink-for-presenting-all-media-types
[20:57] <kompewter> [ XMLUI Customizations to Themes · osulibraries/DSpace Wiki · GitHub ] - https://github.com/osulibraries/DSpace/wiki/XMLUI-Customizations-to-Themes#snazy--kitchen-sink-for-presenting-all-media-types
[20:57] <mhwood> I thought we got Real Versioning in 3.0, but admit I haven't tried it out yet.
[20:57] <tdonohue> mhwood: deal with data, video, audio = uploading/downloading larger content, streaming, versioning, finding more ways to let *users* upload/download larger content themselves (without having to have a sysadmin do it on the backend). Relating data sets back to the related publications, etc.
[20:57] <hpottinger> problem with any large file: nearly impossible to upload
[20:58] <hpottinger> and, as fun as it is to upload, it'll be that much fun to get it back out
[20:58] <tdonohue> Just trying to be the "voice" of what I've heard. I hope I'm representing this all well enough without making things totally fuzzy.
[20:58] <mhwood> Larger content probably wants asynchronous submission. Pull rather than push? (that requires the submitter to have a server of some sort.)
[20:59] <helix84> i think a lot of big data complaints could be solved by providing an uploader that supports resuming and a progress status bar (i filed a feature request ticket for that)
[20:59] <tdonohue> hpottinger: yea, those statements were both pointed at as major pain points
[20:59] <hpottinger> SWORD can probably be modified to allow for deposit by reference, then you can at least leverage an upload service that isn't as "painful"
[20:59] <bram-atmire> the globus GridFTP work from university of exeter looks promising
[21:00] <bram-atmire> as well as the zenodo submit files from your dropbox account
[21:00] <bram-atmire> http://www.globusworld.org/files/2013/02-Taylor-Plugging_the_BIG_DATA_Gap_in_DSpace_Using_SWORD_and_Globus.pdf
[21:00] <tdonohue> yea, hooking into things like dropbox could be interesting. It's a place people can get stuff into relatively easily, and we can "pull" from there.
[21:00] <mhwood> We could provide a tool that backgrounds itself and chugs away (using something better for huge files than HTTP)
[21:01] <hpottinger> though, relying on a service to provide upload isn't quite as satisfying as actually allowing large files to be uploaded
[21:01] <PeterDietz> sorry to add another how-to do it. For instance, YouTube allows you to upload 20GB+ files through your browser
[21:02] <tdonohue> Yea, as PeterDietz points out...this is a *solvable* problem. Other services have ways of "chunking up" uploads into smaller chunks so that you can upload larger files. YouTube is not the only one
[21:02] <tdonohue> Realizing the meeting time has just flown by.
[21:02] <helix84> i'd like to make a step back here: i'm hearing a lot of problems dspace has, but it's not like we're hearing any new ones - all of them are identified and most of them are being worked on by somebody. maybe the vision of pushing dspace forward should have more to do with figuring out how to make it easy/efficient to contribute back to dspace than individual pain points.
[21:04] <tdonohue> helix84: that's completely on the table. As bram-atmire said earlier (I think it was him), it's really unclear at this point whether this needs to be a *revolutionary* change (re-build it), or if it can be *evolutionary* (find ways to clean up and improve, little by little, what we have)
[21:05] <hpottinger> as an example, when I asked around about improving upload experience for large files last year, a helpful ePrints guy suggested using a WebDav interface.... which would seem to point towards our much beloved LNI
[21:06] <tdonohue> It's not that DSpace is bad (everyone at Chicago still expressed a ton of support and love for DSpace - it's been a great tool through the years)...it's just that it's starting to "show its age". We need to bring it more up-to-date. That can be done in a variety of ways, and doesn't necessarily mean we have to start over from scratch
[21:06] <mhwood> A number of these problems take a significant amount of work. So who can help us prioritize? We should pick at least one big one and drive it forward, keeping a list of what's next.
[21:07] <bollini> to be honest we have implemented an ajax upload progress bar for our customers and this has solved our problem with (fake=100Mb) large files
[21:07] <mhwood> I really am not hearing much about up-to-date; what I am hearing is "finished". Configuration is unfinished, in a state that was good enough to get something up and running but not much better. Uploading is unfinished, unable to cope with huge files. Etc.
[21:07] <bollini> if this is really a priority for the dspace community I can check with my boss to find the resource to release it
[21:08] <helix84> to drive my point home - why isn't the ajax upload form in upstream dspace => is it hard to port it back? is there a problem with the process?
[21:08] <hpottinger> bollini ++
[21:08] <helix84> (i mean the process of accepting contributions)
[21:08] <PeterDietz> forest vs trees. There are 100+ institutions that develop solutions to several projects a year. How much of that gets contributed back, how can you facilitate that? What if institutions keep re-inventing features.
[21:08] <tdonohue> mhwood: RE: Prioritizing. Part of the next steps here. Start to clean up the Vision & iterate on it with tech folks & discuss more at OR13 -> Develop a general technical plan (deciding the "how") and iterate on that as needed -> Determine what sort of additional support (more developers / more funding) we may or may not need
[21:09] <robint> bollini: ++ from me too :)
[21:09] <mhwood> tdonohue: PeterDietz. The developer cycles may already be there and engaged, but lacking sufficient communication.
[21:10] <hpottinger> a really interesting problem is data sets: is there any such thing as a "thumbnail" for a data set?
[21:11] <bram-atmire> hpottinger: http://icons.iconarchive.com/icons/mart/glaze/128/binary-icon.png
[21:11] <tdonohue> mhwood: very true. The problem first is to figure out what it is we want/need... then communicate that out and see what resources we may already have "at the ready" and what we need to go find. This is where we (may) possibly be looking towards a "DSpace Steering Committee" (or some sort of Governance) to possibly help us out in the future.
[21:11] <PeterDietz> To us, we want to invest in building a feature, we don't always want to invest in "then cleaning it up, and contributing it upstream", as that takes time away from building the next feature
[21:12] <bollini> hpottinger: we should provide services around data set... for example for csv we can provide search inside, filtering facilities, statistics
[21:12] <hpottinger> bram-atmire: nice one. For some kinds of data (geospatial) it's natural to just scale back, or show the outside bounds of the box
[21:12] <PeterDietz> We do, contribute things here and there. I would say a rogue group of DSpace developers should go around stealing features from everyone's DSpace github repos, polishing them up, and contributing it upstream.
[21:12] <tdonohue> The idea of a "DSpace Steering Committee" is a total brainstorm at this point (came out of the DuraSpace Summit Discussions in Baltimore a few months back). But, several institutions have wondered out loud if establishing such a Steering Committee could help the Committers in prioritizing and also help us find more resources when we find ourselves short on resources
[21:12] <bram-atmire> bollini: second. Integrating Open Refine (old google refine) would be a killer feature
[21:12] <helix84> PeterDietz: I can understand that thinking. Just let me point out a problem with it. Someone else with the same thinking might have already built your next feature and not yet contributed it back.
[21:13] <PeterDietz> I'm not the first one to build citation cover-pages to PDFs
[21:13] <hpottinger> PeterDietz, are you inviting me to poach your github?
[21:14] <bollini> bram-atmire: yes it looks promising but I haven't yet looked to it in details
[21:14] <helix84> tdonohue: count me in for wondering out loud ;)
[21:14] * tdonohue is loving the large amount of discussion we have going on here (and this is our largest meeting in terms of attendance in a while). It's getting hard to even keep up
[21:14] <PeterDietz> hpottinger: I haven't deviated from the base-DSpace license. I think that means you can have a field day with it. And then personally, feel free to be the Robin Hood
[21:14] <mhwood> We do need someone to just keep asking, "what are you all working on that might be generally interesting? Can someone take on this identified priority?"
[21:15] * tdonohue will note however that our meeting is in "overtime". I'm not going to call any more agenda items...but, I will be around to continue this discussion for a bit
[21:16] <hpottinger> Okeydoke, I'm already pulling stuff out of OSU's github repository, if I think it might be "generally useful" I may "contribute" it.
[21:16] <hpottinger> anybody else want their code poached while I'm at it?
[21:16] <tdonohue> I'm hoping in general that most of what I have said is coming across somewhat "clear" (at least hopefully it's not all fuzzy chaos in your minds). It's awfully hard to determine that via IRC. But, I take the discussion as good indications here... it's great brainstorms.
[21:17] <aschweer> my biggest hurdle with contributing back is getting the time to clean up my code (mainly to pull out oddities that really are site-specific or to separate several modifications of the same code). licensing might also be an issue
[21:17] <mhwood> I try to design generality in. I'll be the first to admit that it isn't easy.
[21:17] <PeterDietz> My university has a department called "Technology Transfer", which is supposed to help in licensing and commercializing great ideas. Thus, you create a prototype that's really cool, hand it to them, and then they, seeing the potential, add some polish to it to make it more generically useful (and monetizable). In open source, this is called spending the extra effort to contribute it back up... which is effort
[21:18] <helix84> aschweer: can you put up a repo with the site-specific stuff? someone else may be interested in cherry-picking it.
[21:18] <hpottinger> all kidding aside, I'm happy to collaborate with any of you, if you've got something cool you want to contribute, show me and I'll see about helping you, or getting you help
[21:18] <bram-atmire> hpottinger maybe you can poach these themes (not mine ;) http://scientific.oceandrilling.org/xmlui/ & https://ufal-point.mff.cuni.cz/xmlui/
[21:18] <kompewter> [ Scientific Drilling Documents ] - http://scientific.oceandrilling.org/xmlui/
[21:18] <kompewter> [ Home ] - https://ufal-point.mff.cuni.cz/xmlui/
[21:18] <aschweer> helix84: I don't know, to be honest. I have had "bring up licensing and contributing back" on my to-do list for over a year
[21:18] <aschweer> (complicated consortium situation)
[21:18] <bollini> I need to leave. Thanks to all, very useful discussions!
[21:19] <bram-atmire> bye andrea, and congrats again on your eurocris appointment, very cool
[21:19] <PeterDietz> scientific drilling looks like it should be called "Twitter Bootstrap Theme"
[21:19] * bollini (~chatzilla@pD9E846B1.dip0.t-ipconnect.de) Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[21:19] <bram-atmire> indeed, it's twitter bootstrap on top of dspace
[21:19] <tdonohue> oh interesting
[21:20] <mhwood> I must go as well. Thanks, everyone!
[21:20] <aschweer> I wonder if they got the admin pages bootstrap-ed as well. that was the trickiest part when I tried. plus persuading cocoon to spit out the right doctype (I gave up on both)
[21:20] * mhwood (mwood@mhw.ulib.iupui.edu) has left #duraspace
[21:20] <PeterDietz> I use bootstrap, cocoon needed some coaxing to spit out html5-ish doctype
[21:20] <bram-atmire> zenodo is bootstrap based as well, right?
[21:21] <bram-atmire> https://zenodo.org/
[21:21] <kompewter> [ ZENODO ] - https://zenodo.org/
[21:21] <aschweer> anyway, I need to leave too. thanks everyone!
[21:21] <bram-atmire> (or at least some of the buttons)
[21:21] * aschweer (~schweer@schweer.its.waikato.ac.nz) Quit (Quit: leaving)
[21:21] <helix84> a concluding thought before i leave - dspace has such momentum that although we're not yet sure which way to steer it, it's not going away any time soon
[21:21] <PeterDietz> yeah, their menu / buttons look bootstrappy, and view-source says yes, it's bootstrap
[21:22] <tdonohue> helix84 ++ Heard that from many people. The reason we're actually getting this feedback now is that people all *care* about DSpace, and want it to stick around. They don't want to have to (eventually) leave it for something else
[21:24] <helix84> just thought i'll say it for a sleep without worries for anyone here :)
[21:24] <helix84> good night
[21:24] * helix84 (~a@ip4-95-82-147-170.cust.nbox.cz) has left #duraspace
[21:30] * robint (522a6b02@gateway/web/freenode/ip. Quit (Quit: Page closed)
[21:46] * hpottinger (~hpottinge@mu-162198.dhcp.missouri.edu) has left #duraspace
[21:46] <bram-atmire> nighty night
[21:46] * bram-atmire (~bram@94-225-35-170.access.telenet.be) Quit (Quit: bram-atmire)
[21:51] * tdonohue (~tdonohue@c-67-177-111-99.hsd1.il.comcast.net) has left #duraspace
[22:06] * PeterDie_ (~peterdiet@dhcp-140-254-148-230.osuwireless.ohio-state.edu) has joined #duraspace
[22:10] * PeterDietz (~peterdiet@ Quit (Ping timeout: 260 seconds)
[22:11] * PeterDie_ (~peterdiet@dhcp-140-254-148-230.osuwireless.ohio-state.edu) Quit (Ping timeout: 246 seconds)
[22:23] * ksclarke (~kevin@pdpc/supporter/active/ksclarke) Quit (Quit: Leaving.)
[23:10] * ksclarke (~kevin@pdpc/supporter/active/ksclarke) has joined #duraspace
[23:28] * ksclarke (~kevin@pdpc/supporter/active/ksclarke) Quit (Quit: Leaving.)

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.