#duraspace IRC Log

IRC Log for 2015-03-18

Timestamps are in GMT/BST.

[6:46] -wilhelm.freenode.net- *** Looking up your hostname...
[6:46] -wilhelm.freenode.net- *** Checking Ident
[6:46] -wilhelm.freenode.net- *** Found your hostname
[6:46] -wilhelm.freenode.net- *** No Ident response
[6:46] * DuraLogBot (~PircBot@ec2-107-22-210-74.compute-1.amazonaws.com) has joined #duraspace
[6:46] * Topic is '[Welcome to DuraSpace - This channel is logged - http://irclogs.duraspace.org/]'
[6:46] * Set by cwilper!ad579d86@gateway/web/freenode/ip.173.87.157.134 on Fri Oct 22 01:19:41 UTC 2010
[7:17] * Jags (3b584fbd@gateway/web/freenode/ip.59.88.79.189) has joined #duraspace
[7:17] <Jags> hi all
[7:17] <Jags> can anyone help me
[7:17] <Jags> regarding dspace search
[7:18] <- *Jags* hi
[7:19] * Jags (3b584fbd@gateway/web/freenode/ip.59.88.79.189) Quit (Client Quit)
[12:10] * mhwood (mwood@mhw.ulib.iupui.edu) has joined #duraspace
[13:00] * peterdietz (uid52203@gateway/web/irccloud.com/x-ozencubasojsdxtr) has joined #duraspace
[13:01] * tdonohue (~tdonohue@c-98-215-0-161.hsd1.il.comcast.net) has joined #duraspace
[15:47] * pbecker (~pbecker@ubwstmapc098.ub.tu-berlin.de) has joined #duraspace
[18:29] * pbecker (~pbecker@ubwstmapc098.ub.tu-berlin.de) Quit (Quit: going home)
[18:40] * hpottinger (~hpottinge@mu-162188.dhcp.missouri.edu) has joined #duraspace
[18:47] * srobbins (~srobbins@libstfsdg02.library.illinois.edu) has joined #duraspace
[18:59] * kohts (~kohts@ppp91-79-215-155.pppoe.mtu-net.ru) has joined #duraspace
[19:56] * mhwood (mwood@mhw.ulib.iupui.edu) has left #duraspace
[20:02] <tdonohue> Hi all, it's time for our weekly DSpace Developers meeting. https://wiki.duraspace.org/display/DSPACE/DevMtg+2015-03-18
[20:02] <kompewter> [ DevMtg 2015-03-18 - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2015-03-18
[20:03] <tdonohue> The agenda today looks a bit "long" as the first topic is a summary of the discussions from last week's "DuraSpace Summit" (especially from the DSpace project breakouts)
[20:03] <tdonohue> I'm not going to dig deeply into each of my bullets in the agenda, but I'd encourage you to take a moment to read through them (if you haven't already)
[20:04] <tdonohue> The main points here are that: (1) Everyone agreed that having 2 UIs (JSPUI & XMLUI) is wasteful, and we need to work towards converging on a single UI
[20:05] <tdonohue> (2) DSpace Steering committee is tasking a small working group (including myself) to start to draft up a "DSpace Product RoadMap" (which will likely include how we can start to work towards converging on a single UI)
[20:05] <tdonohue> (3) That Product RoadMap will be presented in draft form for feedback at OR15
[20:06] <tdonohue> (4) DSpace Steering Committee also wants to try to enhance our marketing of DSpace (forming a Marketing Working Group to do so), and do some more extensive fund raising this year in order to try to hire a Product Manager (to work alongside myself, as the Tech Lead)
[20:07] <tdonohue> Those are the major themes / discussions that came out last week...more details in the agenda summary
[20:08] <tdonohue> I'll do my best to keep everyone up-to-date on all these various topics...there will also be a lot more discussion & presentations (especially on the upcoming Product RoadMap) at OR15
[20:08] <tdonohue> If there are any questions, concerns or comments here, I'd be glad to answer them or help clarify anything
[20:09] <hpottinger> so the product manager idea is directly tied to getting more funding?
[20:10] * tdonohue will pause here for a moment, in case anyone is typing. I'm also glad to answer questions offline as needed
[20:11] <hpottinger> and the road map process, has there been a change there?
[20:11] <tdonohue> hpottinger: yes, we cannot hire a Product Manager without more funding. My role as Tech Lead (as well as a Product Manager role) is dependent on Membership and/or Fundraising. Last year (2014), we raised enough funds via the Membership Drive to cover me (at ~70% time), but *not* to cover a Product Manager
[20:12] <tdonohue> As far as the Product RoadMap is concerned, there have been no significant changes (as before this meeting we already had plans for a rough draft Product Roadmap to be presented at OR15). The only slight change is that the issue of two UIs was noted as a big concern, and everyone seemed to agree it will need addressing in the Product RoadMap
[20:13] <hpottinger> I think there was talk of hiring a consultant to help with drafting the road map?
[20:14] <hpottinger> (I'm not trying to be difficult, I just know I have some people here who will ask)
[20:15] <tdonohue> Yes, the hiring of a consultant has changed (thanks for reminding me)....there's been a slight "wrench" in that process, as the consultant we were talking with may no longer be available. We're now working towards getting this work done via a "RoadMap Working Group" on the same schedule (but without that consultant)
[20:15] <tdonohue> As a replacement to the consultant, I'm working on freeing up more of *my* time in the coming months, so that I can help to lead that "RoadMap Working Group"
[20:16] <hpottinger> tdonohue: how can we help?
[20:17] <tdonohue> mostly "be available for feedback" (as the process gets started). This is going to be a *very rapid* process, so the RoadMap Working Group is intentionally a small group (with representatives from Committers, DCAT, Service Providers, etc).
[20:17] <hpottinger> so, perhaps it's as simple as: please hang out in IRC? :-)
[20:17] <tdonohue> But, the RoadMap Working Group does plan to try and give opportunities for feedback to especially the Steering Group, Committers & DCAT (and anyone else who wants to help)....we hope to have at least one comment period prior to OR15...but we'll have many more at OR15 and likely even beyond then
[20:18] * aschweer (~schweer@schweer.its.waikato.ac.nz) has joined #duraspace
[20:18] <tdonohue> Yep..."please hang out in IRC"..."please comment via email (as we start to send out very early drafts)", etc.
[20:19] <tdonohue> I anticipate most of the Product Roadmap work will likely be done in "public" (on the wiki) anyhow...so folks can likely just watch the page (once we get one setup) and add comments as we go
[20:19] <hpottinger> sounds good to me
[20:20] <tdonohue> Any other questions, concerns, clarifications, comments on the Summit discussions?
[20:20] <hpottinger> would it be too much to organize that work into "sprints?" even though it's not really coding
[20:21] <tdonohue> hpottinger: good thought...yea, it's not really coding, but the sprint idea still might ensure we stay on track. I like it. I'll mention that to the group as we get started (we haven't started yet)
[20:22] <aschweer> the sprints (with posted start/end dates) might also help get eyes on the work, because then we can pick and choose those topics we feel most confident with / that are most relevant locally
[20:24] <tdonohue> I'm doubting we have enough time between now and OR15 to *exactly detail* which topics will be discussed in each sprint...but we'll see. The DCAT gathered use cases will be the "starting point" for this Product RoadMap though, so we might be able to do "high level" topic sprints (Admin vs End User vs Submit, etc)
[20:24] <tdonohue> https://wiki.duraspace.org/display/DSPACE/Use+Cases
[20:25] <kompewter> [ Use Cases - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/Use+Cases
[20:25] <aschweer> tdonohue yes that's the type of breakdown I think would be helpful
[20:25] <peterdietz> Are there any other notes from the summit, or was this most of it?
[20:25] <tdonohue> cool, yea, I'll see what I can do then, aschweer. I like the idea
[20:25] <hpottinger> Can Confluence/Jira work a bit more like Trello?
[20:26] <tdonohue> peterdietz: this was the summary of the "DSpace Breakout sessions" (from Day 2 of the summit). Day 1 of the summit was mostly presentations (all of which were recorded & will be public sometime soonish) on DuraSpace and each of the Projects (DSpace, Fedora, VIVO)
[20:27] * hpottinger answers his own question: yes, with Greenhopper
[20:27] <tdonohue> I didn't take notes on the Day 1 stuff, to be honest, as it all was more of an overview of the last year, and where each project is going. Oh, and a portion of Day 1 was the "Introducing the new DuraSpace CEO, Debra Hanken Kurtz" ;)
[20:28] <tdonohue> I do know that additional summaries/reports from the summit will be posted to the DuraSpace blog in the next week or so...this was just my personal notes from the DSpace discussions
[20:29] <tdonohue> Any other questions or comments on all this? Glad to answer them (now or later on)
[20:30] <peterdietz> Cool. So UI may or may not be the hottest topic at the moment. And a roadmap will help us guide future development
[20:31] <tdonohue> peterdietz: UI is one of the bigger topics in that it affects our Product Roadmap (and everyone sees it as a waste of current developer/committer resources to be building the same feature "twice"). And yea, the Product Roadmap's role is to try to guide future development and help organize all of us around a common "path" for DSpace (so we can avoid duplicative work as much as possible)
[20:33] <tdonohue> Ok, sounds like discussion is slowing here...so we may as well move on to other topics. As mentioned, I'll try to keep everyone posted on Product RoadMap status (as soon as we get kicked off). Let me know if any other questions come up.
[20:34] <peterdietz> Okay. I'm thinking about how REST may or may not have been an example of how there were divergent routes to get there, and we eventually rallied around one "blessed" REST API, and that there has been consistent support on that feature
[20:35] <hpottinger> so, I'm assuming that organizing the use cases will be a big part of what the Road Map will tackle?
[20:35] <peterdietz> So, either have one of the current UIs get a blessing, then get overhauled to be the best choice in 2015 and beyond, or someone taking on a huge role and building a new UI, that people can get behind
[20:35] <hpottinger> peterdietz: that's actually the story of the REST-API right there, truth be told
[20:35] <tdonohue> hpottinger: yep. The Product RoadMap team will initially be analyzing the use cases in more detail, trying to prioritize them & organize them.
[20:36] * mhwood (~mhwood@adsl-99-130-170-8.dsl.ipltin.sbcglobal.net) has joined #duraspace
[20:36] <peterdietz> Either way, anyone with strong feelings can still use their alternative (the other REST implementations, other UI implementations), but you're mostly on your own.
[20:38] <hpottinger> tdonohue: that's kind of why I was asking about Trello, as it has a nice interface for organizing cards/use cases/stories
[20:38] <hpottinger> perhaps we can fire up Greenhopper and put all the use cases in it?
[20:38] <tdonohue> peterdietz: yes, the UI path may be similar to the REST path. I think the issue here is that solving the two UI problem may be necessitated by the Product RoadMap (in that in order to tackle some use case or features, we need a UI decided upon...otherwise, we are duplicating effort again)
[20:39] <tdonohue> hpottinger: Greenhopper might be a good plan here. I admit, I haven't played with it much (yet). But I know some projects use it more heavily (e.g. DuraCloud)
[20:40] <mhwood> One difference is that very few were using REST early on. EVERY SITE is using a UI.
[20:41] <hpottinger> mhwood: hopefully we see some convergence with Bootstrap stuff
[20:41] <hpottinger> just in time to change everything all over again! :-)
[20:42] <tdonohue> yep, true. The UI problem is not a "small one". It'd require a migration path for potentially *every* user of DSpace. We initially made sure that our "Product Roadmap Working Group" has representatives for both XMLUI and JSPUI, etc
[20:42] <tdonohue> *intentionally* (not initially)
[20:43] <tdonohue> Any other thoughts/comments? Not wanting to stop the discussion here...just trying to gauge if it's time to move on to other topics
[20:45] <tdonohue> Ok, all sounds quiet. So, we may as well move on to other topics for today.
[20:46] <tdonohue> Next topic - DSpace 5.2. I'm aware that we have some Solr Statistics index issues that need resolution soonish...So, just curious how that's going, and whether there are any other major issues warranting a 5.2 in the near future?
[20:46] <aschweer> I can give a status update on the Solr issues
[20:47] <mhwood> Please.
[20:47] <tdonohue> go ahead please :)
[20:47] <aschweer> I think we need a way to reindex all solr documents, ideally one that works for all our solr-as-main-datastore cores (statistics + authority)
[20:47] <hpottinger> +1
[20:47] <aschweer> I see two approaches. 1) Reindex in place - this is what Terry Brady's code is doing
[20:47] <aschweer> 2) Export all solr docs, empty out the core, re-import all solr docs
[20:48] <aschweer> I've tried (1) but at least my code doesn't work too well, so I've started on (2) but have had other fires to put out so haven't got very far
[20:48] <tdonohue> I think #2 (export all solr docs) is necessary anyways for a safe "backup" (even if we had a "reindex in place" option)
[20:48] <aschweer> tdonohue agreed
[20:49] <tdonohue> So, in my opinion, #2 is required. #1 might be "nice" (cause it's likely quicker)
[20:49] <mhwood> Also less disruptive.
[20:49] <aschweer> not sure about quicker, #1 took 2 hours on 6.6M solr docs (and didn't fix all the problems)
[20:49] <hpottinger> +1 way to back up and restore stats would be great
[20:49] <mhwood> We can probably provide #2 rather quickly, and gain time to work on #1.
[20:49] <tdonohue> +1 mhwood
[20:50] <aschweer> I'm a little stumped on some of the behaviour I'm seeing with my #1 code
[20:50] <mhwood> Yes?
[20:50] <aschweer> just a reminder, for now what we need to do is (a) add uids to all stats docs from pre DSpace 3 and (b) reindex all stats docs from pre DSpace 5 so that they use docValues for those fields where that's enabled
[20:51] <aschweer> My code for the reindexing achieves (b) but not (a). Terry's code looks pretty similar to mine, it achieves (a) but achieves (b) only for those docs that are affected by both problems
[20:51] <hpottinger> just to give context: DS-2487
[20:51] <kompewter> [ https://jira.duraspace.org/browse/DS-2487 ] - [DS-2487] Pre-5 Solr usage stats geo information may get lost when upgrading to 5 - DuraSpace JIRA
[20:51] <mhwood> Sounds to me like we need two, uh, "migrations".
[20:51] <aschweer> so essentially, I don't know why his code appears to add the uids but mine doesn't (I'm sure I might just be a bit blind to the details at the moment)
[20:51] <aschweer> in theory, one migration can achieve both
[20:51] <aschweer> reindex all docs, add uid where needed
[20:52] <aschweer> reindexing achieves (b) all by itself, nothing needed
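
The "reindex all docs, add uid where needed" idea in the lines above can be sketched roughly as follows. This is in Python for brevity, not DSpace code. The field name `uid` comes from the discussion; the helper name `ensure_uid`, the use of plain dicts for documents, and generating the value with `uuid4` are assumptions for illustration only, and may not match what Terry's gist or DSpace itself does.

```python
import uuid


def ensure_uid(docs):
    """Yield copies of Solr-style documents, adding a uid only where missing.

    Assumption: documents are plain dicts and a random UUID is an acceptable
    uid. Documents that already carry a uid pass through unchanged.
    """
    for doc in docs:
        fixed = dict(doc)  # copy so the caller's document is not mutated
        if not fixed.get("uid"):
            fixed["uid"] = str(uuid.uuid4())
        yield fixed


if __name__ == "__main__":
    sample = [
        {"type": 0, "id": 12},                  # pre-DSpace 3 style: no uid
        {"type": 0, "id": 13, "uid": "abc-1"},  # already has a uid
    ]
    for doc in ensure_uid(sample):
        print(doc)
```

Writing every document back (not only the ones that gained a uid) is what would cover case (b) as described above, since reindexing alone is said to be enough for docValues.
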
[20:52] <mhwood> OK.
[20:52] <aschweer> thanks for the ds link; Terry's code is here: https://gist.github.com/terrywbrady/82bd91b53ea4374b96e4
[20:52] <kompewter> [ Copy DSpace Statistics Records to Force UID generation ] - https://gist.github.com/terrywbrady/82bd91b53ea4374b96e4
[20:53] <aschweer> I have my first attempt at #1 in a gist somewhere too, but it isn't at all up to date
[20:53] <aschweer> and I will need to leave in 40 minutes to go to a funeral, won't be back till tomorrow
[20:53] <aschweer> so, I think we should maybe focus on #2 anyway
[20:53] <aschweer> there is an approach like that in the SolrLogger sharding code
[20:54] <aschweer> however, that one has a bug at the moment because it doesn't strip out the _version_ fields, see DS-2212
[20:54] <kompewter> [ https://jira.duraspace.org/browse/DS-2212 ] - [DS-2212] Statistics Shard not working, version conflict - DuraSpace JIRA
[20:54] <aschweer> also, there are smarter ways of reading solr responses into a csv
[20:54] <tdonohue> Yea, it seems like if we can get a #2 option done quickly, we could even potentially release a 5.2 with that (if the #1 option takes more work/analysis, it could wait for a 5.3)
[20:55] <aschweer> (better ways than in Terry's patch for 2212, I mean)
[20:56] <aschweer> FileUtils.copyURLToFile(queryURL, file)
[20:56] <aschweer> then use the code as in the sharding to read it back in, stripping the _version_ field
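
A minimal sketch of the read-back step described above, assuming the export was saved as CSV with a header row. It is written in Python rather than the Java used by the sharding code, and the function name `strip_version_column` is hypothetical; only the `_version_` field name comes from the discussion.

```python
import csv
import io


def strip_version_column(csv_text, field="_version_"):
    """Return CSV text with the given column removed, if present."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows or field not in rows[0]:
        return csv_text  # nothing to strip
    drop = rows[0].index(field)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # Keep every cell except the one in the _version_ column.
        writer.writerow(row[:drop] + row[drop + 1:])
    return out.getvalue()


if __name__ == "__main__":
    exported = "id,_version_,time\n1,1495,2015-03-18T20:56:00Z\n"
    print(strip_version_column(exported))
```

Dropping the column before re-import is meant to avoid the version conflict reported in DS-2212; whether that alone is sufficient is not established here.
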
[20:57] <aschweer> the issue with the export-dump-import approach is writing code that isn't ugly and works for everyone
[20:58] <aschweer> you probably want to throw a sort clause into the query used for exporting. but there isn't a common field between statistics and authority that would work (time for statistics, creation or modification date for authority)
[20:58] <aschweer> also, I don't know whether special magic is needed in the import to add the uids
[20:58] <aschweer> and of course we don't want to add uids to the authority core
[20:58] <aschweer> plus, the core names are configurable, as is the location of the solr server
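
The per-core differences listed above (sort field, whether uids are added, configurable core names and server location) could be gathered in one small table, sketched here in Python. Everything named below is an assumption for illustration: the `CORES` table, the `export_url` helper, the `creation_date` field for the authority core, and the example server URL. `wt=csv` and `sort` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical per-core settings; core names and the server URL are
# configurable in DSpace, so real code would read them from configuration.
CORES = {
    "statistics": {"sort_field": "time", "add_uid": True},
    "authority": {"sort_field": "creation_date", "add_uid": False},
}


def export_url(solr_base, core, start=0, rows=10000):
    """Build a Solr select URL that exports one page of a core as CSV."""
    settings = CORES[core]
    params = urlencode({
        "q": "*:*",
        "wt": "csv",
        "sort": settings["sort_field"] + " asc",
        "start": start,
        "rows": rows,
    })
    return f"{solr_base.rstrip('/')}/{core}/select?{params}"


if __name__ == "__main__":
    print(export_url("http://localhost:8080/solr", "statistics"))
    print(export_url("http://localhost:8080/solr", "authority", start=10000))
```

Keeping the sort field per core sidesteps the lack of a common field between statistics and authority, at the cost of one more setting to maintain.
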
[20:58] * kohts (~kohts@ppp91-79-215-155.pppoe.mtu-net.ru) Quit ()
[20:59] <hpottinger> I am available to help test and can contribute whatever meager amount of brain power I have available, DS-2487 is a blocker for our upgrade to 5
[20:59] <kompewter> [ https://jira.duraspace.org/browse/DS-2487 ] - [DS-2487] Pre-5 Solr usage stats geo information may get lost when upgrading to 5 - DuraSpace JIRA
[20:59] <tdonohue> So, I'll admit I'm not sure whether I'll have much time to spend on coding/analysis in this problem (as much as I'd like, I'm rapidly trying to clear my "plate" to take on this Product RoadMap work).
[20:59] <tdonohue> But, I'm a very willing tester. I have mostly 4.x instances to test with though, so I only can easily test the docValues part.
[21:00] <aschweer> it is a high priority for our clients, since it is a blocker here too. But as I said, I'm out all of today.
[21:00] <mhwood> The Solr instance's URL probably should be taken from the normal configuration, without asking.
[21:00] <hpottinger> I've got a 3x-era Solr stats core just waiting to be upgraded
[21:00] <aschweer> cool. Just remember, you can take a copy of the stats core so you can do multiple test runs on the same data
[21:01] <aschweer> mhwood yes, probably. My code currently takes a flag to say which index to work on
[21:01] <hpottinger> indeed, I'm testing on a copy
[21:01] <tdonohue> I think the important piece here is just to keep this moving along, and as we get closer to a "resolution" we'll need to set more specific dates for a 5.2 release. This is definitely a blocker for 5.2, but we don't have a schedule for 5.2 until we can get closer to resolution
[21:02] <aschweer> I admit that I'm currently trying to get this working first in a way that helps my clients. So there may be opportunity in a little while for taking my code and making it more generic
[21:02] <aschweer> that sounds good
[21:02] <aschweer> in the meantime, if anyone has an idea, please shout out -- probably best place is on the devel list
[21:02] <mhwood> We're about to go 5.x here, so I'll need the fix too. If there's something I can do, ask.
[21:02] <tdonohue> yep, I agree -devel or JIRA is the best place for these discussions.
[21:02] <aschweer> has anyone heard from @mire / CINECA (as the people who made the schema changes) whether they perhaps have local fixes hanging around for this?
[21:03] <tdonohue> I have not, aschweer. You might even want to ping them directly (Bram & Andrea)...and possibly even CC in -commit if you want
[21:03] <aschweer> cool, I'll do that when I get the chance
[21:03] <tdonohue> I'd expect they'd encounter this at some point as well, and they should be motivated to get it fixed ;)
[21:04] <tdonohue> thanks for all your work on this, aschweer!
[21:04] <mhwood> Yes, thank you.
[21:04] <aschweer> I'm guessing it is too much to hope for that they have the fix already and it just needs to be shared :)
[21:04] <aschweer> thanks guys :)
[21:04] <hpottinger> oh, hey, tdonohue, there's a new way to ping, isn't there?
[21:04] <tdonohue> It'd be nice if they did have a fix, but my guess is that they do not
[21:05] <tdonohue> Oh, yes, in JIRA tickets, you can type "@" in a Comment, and pull up someone's name. It'll send them an email letting them know you mentioned them on that Ticket
[21:05] <tdonohue> So you could ping Bram & Andrea on that JIRA ticket (2487)
[21:05] <aschweer> oh oops I didn't know it spams people, I've been making liberal use of that for Terry to make sure he gets credit for his work too :)
[21:05] <tdonohue> (and the others that are related)
[21:06] <tdonohue> aschweer: I actually *like* that it emails folks separately....it allows us to more easily ping someone who may not have noticed the ticket in their inbox (or mail filters)
[21:06] <hpottinger> speaking of spam, I'm about to add an at-ping comment to DS-2487
[21:06] <kompewter> [ https://jira.duraspace.org/browse/DS-2487 ] - [DS-2487] Pre-5 Solr usage stats geo information may get lost when upgrading to 5 - DuraSpace JIRA
[21:06] <aschweer> tdonohue: yes it totally makes sense, I just didn't think about it
[21:07] * tdonohue is going to have to go shortly, and we're "over time" here
[21:08] <tdonohue> Any other last thoughts for today? It seems like we are mostly wrapping up anyhow
[21:08] <aschweer> I guess one lesson learned from this is that we should check PRs for changes to the solr schema, and make sure there is an upgrade path
[21:08] <mhwood> Argh, I did promise to bring up the question of whether, in the longer term, we should be using Solr as a primary store at all. Maybe carry over to next week?
[21:08] <hpottinger> oh, mhwood, you did start a thread on -devel
[21:09] <aschweer> yes, good point, and if tdonohue has to go (and I have to go soon too), then postponing sounds good. Though of course I'll be fast asleep at next week's meeting :)
[21:09] <tdonohue> The question of "solr as a primary store" is basically this same topic (as far as I'm concerned). I agree, it's not ideal..but if we have a good "backup & restore" solution (as we just talked about), it's not as big a concern
[21:10] <mhwood> True.
[21:10] <tdonohue> ideally, we'd have a better place to store this info "permanently", but the DB is not an ideal place to put access/download stats...so, maybe just having a way to backup your stats to CSV is "good enough"?
[21:10] <hpottinger> right, that's the concern: it's very ephemeral now
[21:10] <aschweer> I agree that it doesn't have to be bad, as long as we give people ways to manage the data
[21:10] <mhwood> Actually the question "should we use it at all" is a subset of the question I posed on the ML: "are we using it well?"
[21:10] <aschweer> ideally I suppose we'd like an incremental export option too, so people can run this on say a weekly basis
[21:11] <aschweer> and I've said this before, but just so it doesn't get lost, ElasticSearch is in the same category as Solr in this regard, so ElasticSearch stats are affected too (just I'm guessing used by fewer people)
[21:12] <tdonohue> I'm sure we'll still have more to talk about on this topic next week too, so I'll leave on the topic of "Solr as a Primary Store". But, I do think it's directly related to what we've discussed today as well :)
[21:13] <tdonohue> At this point, I unfortunately do need to head out. So, I'm going to have to close up the meeting now. I'll check back in on the logs later though (so feel free to continue to discuss things amongst yourselves)
[21:13] <tdonohue> bye
[21:13] <kompewter> bye!
[21:13] <mhwood> So the question is "how should we store event records securely as they accumulate forever, in such a way that we can run useful queries with reasonable efficiency, and feed the results to tools of choice (DSpace builtin stuff and external tools like R or Excel)?"
[21:13] <aschweer> bye tdonohue
[21:14] <aschweer> I don't hate on solr for this, so I'm not sure we need anything "better" at this point
[21:14] <hpottinger> mhwood: and, do we even need to be in this business, or are there other methods of storying and analyzing usage data which we can/should take advantage of?
[21:14] <aschweer> we should try and get the Europeans to weigh in though, they may have some perspectives on stats shaped by their privacy laws
[21:14] <mhwood> Point, aschweer.
[21:14] <aschweer> ie, they might need us to store aggregated stats for visits older than a certain time, etc
[21:14] <hpottinger> s/storying/storing/
[21:14] <kompewter> hpottinger meant to say: mhwood: and, do we even need to be in this business, or are there other methods of storing and analyzing usage data which we can/should take advantage of?
[21:15] <aschweer> well Robin Taylor pointed out that there are now ways to outsource stats to GA entirely, with the download-as-GA-event code and the GA stats inclusion in the user interface
[21:15] <aschweer> that just doesn't help us for historical data
[21:15] <aschweer> so I'm guessing yet again we are looking at use cases
[21:16] <hpottinger> Logstash is an option, probably
[21:16] <aschweer> my repo managers certainly brought out the pitchforks mentioned by someone else on -devel when I mentioned that historical stats data is in danger
[21:16] <aschweer> logstash just uses elasticsearch for its data, no?
[21:16] <mhwood> How well does GA support *exploratory* statistics? (Every time I crank out a new report, they discover that they want something substantially different too.)
[21:17] <hpottinger> aschweer: 'tis their business what they use, I'm just wondering if *we* need to be in this business
[21:17] <aschweer> we also need to keep COUNTER/Sushi/whatever it's called in mind, ie, if that is what our user base wants to use then DSpace should be able to play nicely with that
[21:17] <mhwood> I suspect that we need to be in the business of preserving the event data. I'm not at all sure that we need to be in the business of analyzing them.
[21:17] <aschweer> hpottinger: yeah fair point
[21:18] <aschweer> mhwood++, but with the caveat that parts of our user base may be in legal trouble for storing the event data in non-aggregated form
[21:18] <hpottinger> I'm thinking usage data isn't a very interesting problem to me, if someone else is solving it, I'll let them
[21:18] <mhwood> Yes, so what we store must be configurable in useful ways.
[21:18] <aschweer> I'm not up to date on EU laws but I know that data retention and the like are big topics there, have been for years
[21:18] <mhwood> And we should be sensitive to EU laws (and others).
[21:19] <aschweer> so as I said, we're back to use cases. What information about usage do we agree that people should be able to get out of DSpace. We can't answer that between the three of us.
[21:19] <mhwood> True. I'll try to remember to bring this up again next week.
[21:20] <aschweer> Interesting look at the repo of the future really. The OAIS model doesn't talk about usage stats, and they certainly aren't the primary function of the repository. But they appear to be pretty important to our users in reality.
[21:20] <aschweer> mhwood that'd be great, thanks
[21:20] <hpottinger> mhwood was right about counter, that's pretty much *the* standard for "what usage data people want"
[21:20] <mhwood> That was not me. But I agree that it is important.
[21:21] <aschweer> I agree that it's important but I don't know enough about it to contribute usefully. Another reason to make sure the Europeans get their say, I guess.
[21:21] <hpottinger> oh, shoot, sorry, aschweer brought up counter
[21:21] <aschweer> all good :)
[21:22] <aschweer> (and it feels really weird to point to the Europeans as "the others", but really, I'm way out of touch with what's going on there)
[21:22] <hpottinger> the point is we don't need to invest time researching that use case, it's been done
[21:22] <aschweer> yes, that is a good point.
[21:22] <aschweer> assuming the rest of the world does agree with COUNTER ;)
[21:22] <hpottinger> NISO does a good job with that stuff, I think?
[21:23] <hpottinger> http://www.projectcounter.org/about.html
[21:23] <kompewter> [ COUNTER | About Us ] - http://www.projectcounter.org/about.html
[21:24] <aschweer> Anyway, sorry guys, I really need to head off now. I will read up on counter but probably not in the very near future...
[21:24] <mhwood> We'll keep in touch.
[21:24] * aschweer (~schweer@schweer.its.waikato.ac.nz) Quit (Quit: leaving)
[21:25] <mhwood> Does that about wrap things up for now?
[21:26] <hpottinger> a quick scan shows Counter members to come from a number of countries, and, yeah, I think we're done
[21:27] <mhwood> OK, signing off.
[21:30] * hpottinger (~hpottinge@mu-162188.dhcp.missouri.edu) Quit (Quit: Leaving, later taterz!)
[21:30] * mhwood (~mhwood@adsl-99-130-170-8.dsl.ipltin.sbcglobal.net) has left #duraspace
[21:39] * tdonohue (~tdonohue@c-98-215-0-161.hsd1.il.comcast.net) has left #duraspace

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.