#duraspace IRC Log


IRC Log for 2011-12-21

Timestamps are in GMT/BST.

[0:24] * eddies (~eddies@unaffiliated/eddies) Quit (Quit: Leaving.)
[0:37] * scottatm (~scottatm@voyager108.evans.tamu.edu) Quit (Ping timeout: 245 seconds)
[3:03] * bradmc (~bradmc@207-172-69-79.c3-0.smr-ubr3.sbo-smr.ma.static.cable.rcn.com) Quit (Quit: bradmc)
[6:35] -hitchcock.freenode.net- *** Looking up your hostname...
[6:35] -hitchcock.freenode.net- *** Checking Ident
[6:35] -hitchcock.freenode.net- *** Found your hostname
[6:35] -hitchcock.freenode.net- *** No Ident response
[6:35] * DuraLogBot (~PircBot@atlas.duraspace.org) has joined #duraspace
[6:35] * Topic is '[Welcome to DuraSpace - This channel is logged - http://irclogs.duraspace.org/]'
[6:35] * Set by cwilper!ad579d86@gateway/web/freenode/ip. on Fri Oct 22 01:19:41 UTC 2010
[8:17] * sbayliss (bcde58ad@gateway/web/freenode/ip. Quit (Quit: Page closed)
[12:03] * bradmc (~bradmc@207-172-69-79.c3-0.smr-ubr3.sbo-smr.ma.static.cable.rcn.com) has joined #duraspace
[12:05] * bradmc__ (~bradmc@207-172-69-79.c3-0.smr-ubr3.sbo-smr.ma.static.cable.rcn.com) has joined #duraspace
[12:05] * bradmc (~bradmc@207-172-69-79.c3-0.smr-ubr3.sbo-smr.ma.static.cable.rcn.com) Quit (Read error: Connection reset by peer)
[12:05] * bradmc__ is now known as bradmc
[14:23] * euler_ (7d3ce3da@gateway/web/freenode/ip. has joined #duraspace
[14:26] * euler_ (7d3ce3da@gateway/web/freenode/ip. has left #duraspace
[14:59] * euler__ (7d3ce3da@gateway/web/freenode/ip. has joined #duraspace
[15:29] * JoeDeVries1 (~jdevries@lib-pclax126.lib.utexas.edu) has joined #duraspace
[15:30] * JoeDeVries1 (~jdevries@lib-pclax126.lib.utexas.edu) has left #duraspace
[19:02] * mhwood (~mhwood@adsl-99-130-166-246.dsl.ipltin.sbcglobal.net) has joined #duraspace
[19:09] * mhwood1 (mwood@mhw.ulib.iupui.edu) has joined #duraspace
[19:13] * mhwood1 (mwood@mhw.ulib.iupui.edu) Quit (Client Quit)
[19:15] * mhwood (~mhwood@adsl-99-130-166-246.dsl.ipltin.sbcglobal.net) has left #duraspace
[19:50] * KevinVdV (~KevinVdV@d54C14B50.access.telenet.be) has joined #duraspace
[19:50] <KevinVdV> Hi all, will there be a meeting today ?
[19:57] * PeterDietz (~PeterDiet@peterdietz.lib.ohio-state.edu) has joined #duraspace
[20:00] * euler__ (7d3ce3da@gateway/web/freenode/ip. Quit (Ping timeout: 258 seconds)
[20:06] * mhwood (~mhwood@ has joined #duraspace
[20:07] * PeterDietz (~PeterDiet@peterdietz.lib.ohio-state.edu) Quit (Read error: Connection reset by peer)
[20:07] <mhwood> Sorry I'm late -- off work, home net is flaky, just figured out the WiFi at McDonalds.
[20:11] <mhwood> Hello, is this thing on?
[20:11] <KevinVdV> I can read you loud & clear mhwood
[20:11] * peterdietz_web (8092ad7a@gateway/web/freenode/ip. has joined #duraspace
[20:11] <mhwood> Do we have enough present to meet?
[20:11] <KevinVdV> I doubt it
[20:11] <peterdietz_web> hi all. I'm having a really flaky network. fixed on using spare machine
[20:12] <peterdietz_web> But.. 1.8.1 was released last week, correct?
[20:13] <mhwood> Apparently there are three here right now.
[20:13] <mhwood> Yes, I believe 1.8.1 did release.
[20:15] <peterdietz_web> ok, two features I've added recently, just to share the word. First, there is a StaticPage class in XMLUI for letting the XSL create a page. Nothing too special, but it is now a possibility
[20:15] <peterdietz_web> And then, I've added a method to XMLUI Wing elements for form fields/inputs, so that they can say something like inputField.setAutoFocus("autofocus")
[20:15] <peterdietz_web> ...to make the browser autofocus onto that field.
[20:16] <peterdietz_web> I went through various pages that require user-input, and set the primary/first field in each of them to automatically take the focus.
[20:16] <peterdietz_web> Once you have it, it's kind of nice. i.e. you don't have to use the mouse to activate the field
[20:17] <mhwood> That will be nice. It takes forever to tab down to where one typically wants to start. I've never had the patience....
[20:18] <KevinVdV> Indeed that would be very nice :) (is the autofocus released in DSpace 1.8.1? Or will it be 3.0 only?)
[20:19] <peterdietz_web> it didn't get done for 1.8.1, just the 3.x
[20:20] <mhwood> That will be a nice user-visible improvement to highlight for v3.
[20:21] <peterdietz_web> https://jira.duraspace.org/browse/DS-722
[20:21] <peterdietz_web> in some other news, I've been hacking our (local) statistics reports, and made a grid view for showing annual growth. https://plus.google.com/photos/115522466408243604820/albums/5403404115776252449/5688262624885627010?banner=pwa
[20:21] <mhwood> I think static page improvement will be well received as well.
[20:23] <peterdietz_web> The recent email from DCAT said they'd like to see more metadata, i.e. bitstream metadata and collection metadata. And to break apart the usage of namespaces, so that dc.identified.OhioStateUserID gets refactored to a non-dublin-core namespace
[20:23] <mhwood> Yes, there's a JIRA for the latter (not *yours*, but generally).
[20:24] <mhwood> IIRC
[20:25] <mhwood> I've been off work, but tinkering with a way to set up a different (non-pooled) DBMS connection for commandline parts of DSpace. Nothing to show yet.
[20:26] <mhwood> Reason: I have random connection closure problems with commandline bits, and using a pool for a single session is unnecessary complexity.
[20:27] * sandsfish (~sandsfish@dhcp-18-111-9-139.dyn.mit.edu) has joined #duraspace
[20:27] <peterdietz_web> That makes sense. So basically all DB connections go through DBManager, and going through that gives you a pooled connection?
[20:29] <mhwood> Yes. But if I can supply a non-pooled connection through JNDI, I can get around that without touching DBManager.
[20:29] <mhwood> bin/dspace can make that JNDI provider the one consulted for initial context, and DBManager will just use it.
[20:30] <peterdietz_web> Would it perhaps be easier to create another constructor for DBManager that allows you to specify a non-pooled connection... and then alter all SomethingInvokedThroughLauncher.main things to use that non-pooled connection
[20:31] <peterdietz_web> I'm guessing you messing in this area know more than me.. I don't think I've touched JNDI directly
[20:31] <mhwood> We don't go directly to DBManager, though. Getting a handle on an object implicitly spins up the database.
[20:32] <mhwood> I don't recall the precise details, but I didn't see a good way to tell DBManager that it's running single-session.
[20:33] <peterdietz_web> /sidenote Anyone with Ubuntu (development) boxes notice the recent change in availability in distributed java in partner repository? https://lists.ubuntu.com/archives/ubuntu-security-announce/2011-December/001528.html
[20:33] <mhwood> Besides, I'd been wanting to write a JNDI provider and this one is simple enough for a first attempt.
[20:34] <mhwood> I'd been hearing similar things for other distros. I've begun trying out OpenJDK but not in production yet.
[20:35] <mhwood> It seems Oracle puts each release under some new variation of the license. I've had enough of fetching and reading new license mutants.
[20:36] <mhwood> That's why I asked a while back about other experience with OpenJDK.
[20:36] <peterdietz_web> yeah.. I was going to stick with my old mantra of just stick it out with sun-java. by some form of fetching it directly and installing it.. But I don't want to manage security updates myself.
[20:37] <peterdietz_web> so it becomes.. what _exactly_ is wrong with openJDK, and is there any chance we'll be able to squeak by on it.
[20:37] * ghghgh (52292725@gateway/web/freenode/ip. has joined #duraspace
[20:37] <peterdietz_web> ...the other thought was to fork DSpace to something non-java
[20:38] <mhwood> Oooh. I think it might be less work to just pitch in on OpenJDK development where we have issues....
[20:39] <peterdietz_web> That would be an interesting A/B profile test to work on. DSpace-A (running SunJava), and DSpace-B (running OpenJDK)
[20:39] <mhwood> I don't want to snip this thread prematurely, but...do we have enough here for a JIRA review, or should we let that wait for a larger group?
[20:42] <peterdietz_web> I'm free to review
[20:42] <mhwood> We should bring up JDK issue in dspace-tech. This is going to affect a lot of sites, and not just Ubuntu sites I'm thinking.
[20:42] <peterdietz_web> sandsfish you around?
[20:43] * ghghgh (52292725@gateway/web/freenode/ip. Quit (Quit: Page closed)
[20:43] <mhwood> https://jira.duraspace.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+DS+AND+resolution+%3D+Unresolved+AND+Key%3E%3DDS-914+ORDER+BY+key+ASC
[20:44] <mhwood> It looks like we left off at DS-913, so...DS-914.
[20:44] <mhwood> https://jira.duraspace.org/browse/DS-914
[20:45] <peterdietz_web> looks like I've left a comment on this already.
[20:45] <KevinVdV> Well I am always willing to take that one & see it done for DSpace 3.0 unless you want it Peter Dietz ?
[20:45] <peterdietz_web> KevinVdV, I trust your capabilities of doing a good job on this, better than mine.
[20:46] <mhwood> My gut says that alteration of a submission should go back to the submitter, but I see that the comments are against me so far.
[20:46] <KevinVdV> Thanks, I'll see it done !
[20:47] <mhwood> Thank you, KevinVdV.
[20:47] <peterdietz_web> So, I'm keen to spend some time digging in to this, most likely I'll be sure to review your code, etc.
[20:47] <KevinVdV> Don't know WHEN I will find the time to do it though, but I will get it done for DSpace 3.0
[20:47] <mhwood> https://jira.duraspace.org/browse/DS-916
[20:48] <mhwood> It strikes me that we need some status a little less final than "reject", but that's perhaps another issue.
[20:48] <KevinVdV> Seems like we can close DS 916 since it was created by my ignorance :)
[20:49] <mhwood> Ah, I didn't read far enough. If you feel that the issue has been addressed, go ahead and close.
[20:49] <KevinVdV> Closed
[20:50] <KevinVdV> Should have done it a long time ago
[20:50] <mhwood> https://jira.duraspace.org/browse/DS-918
[20:50] <mhwood> Thanks!
[20:52] <mhwood> It looks as though XMLUI just fails to recheck task state when executing a request?
[20:53] <peterdietz_web> concurrent editing.. yikes
[20:55] <peterdietz_web> a->claim_task(taskID)->editMetadata() .... then b->claim_task(taskID) should possibly give a notification that hey, this task is claimed by A, do you want to claim it too?
[20:56] <KevinVdV> Well in the regular workflow 2 people shouldn't be able to claim the same item (correct me if I am wrong)
[20:57] <mhwood> The issue says that you can't CLAIM concurrently. I think it says that XMLUI is trusting state information contained in the page it sent at some time in the past, and when I think about that it's clearly wrong.
[20:58] <mhwood> The fact that a task was unclaimed when a page was generated says nothing about whether it is unclaimed when a new request comes in.
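The point mhwood makes here can be sketched in a few lines of Python. This is only an illustration, not DSpace code: TaskPool, TaskAlreadyClaimed and the in-memory dictionary are hypothetical stand-ins for the workflow task table. The idea is that the server looks up the current owner when the claim request arrives, rather than relying on the state that was true when the page was rendered.

```python
class TaskAlreadyClaimed(Exception):
    """Raised when a claim arrives for a task another user already holds."""


class TaskPool:
    """Hypothetical in-memory stand-in for the workflow task table."""

    def __init__(self):
        self._owners = {}  # task_id -> user who currently holds the claim

    def claim(self, task_id, user):
        # Re-read the current owner at request time; the "unclaimed" state
        # shown on a page rendered earlier may be stale by now.
        owner = self._owners.get(task_id)
        if owner is not None and owner != user:
            raise TaskAlreadyClaimed(f"task {task_id} is held by {owner}")
        self._owners[task_id] = user
        return user


pool = TaskPool()
pool.claim(42, "a")          # user a claims the task
try:
    pool.claim(42, "b")      # user b acts on an old page
except TaskAlreadyClaimed as err:
    print(err)
```

In a real multi-user system the check and the update would also have to be atomic (one transaction or a lock), which this single-threaded sketch does not attempt.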
[20:59] <KevinVdV> So anybody mind if I claim this one also ? Seems like an interesting bug to fix
[20:59] <peterdietz_web> I'm not gonna stop you..
[20:59] <mhwood> Enjoy!
[20:59] <KevinVdV> Great then the glory is mine alone !
[21:00] <mhwood> We're about out of time. Should we stop the review here?
[21:00] <peterdietz_web> sure. I have a SOLR question, that I just want to pose out loud
[21:00] <KevinVdV> Go ahead
[21:00] <mhwood> I can stay for a while. Say on.
[21:01] <mhwood> I still haven't gone beyond the introduction in that SOLR book, though....
[21:01] <KevinVdV> Which book is that mhwood ?
[21:01] <peterdietz_web> My peoples want me to produce top visitors by domain. i.e. top .edu visiting domains.
[21:01] <KevinVdV> (Always interested to learn new solr stuff)
[21:02] <mhwood> may I get back to you on that? I don't recall and it's not here.
[21:02] <peterdietz_web> i.e. 700 from harvard.edu, 600 from mit.edu, 10000 from osu.edu, ...
[21:02] <peterdietz_web> the problem... is that you can't prefix a query with a wildcard
[21:03] <mhwood> 370 from kcl.ac.uk, so one has to handle multiple suffix lengths.
[21:03] <peterdietz_web> thus your facet can go crawler* to find crawler.msn.com crawler.jeeves.com, etc.
[21:03] <peterdietz_web> but you can't say *.edu
[21:03] <peterdietz_web> or *.osu.edu
[21:05] <peterdietz_web> so.. the question is how would I go about altering my solr schema, to change dns from a string, to something that gets tokenized where each . period splits the token that solr indexes upon
[21:06] <peterdietz_web> bc. I can change the schema, but to make the change affect the data, I got to reindex solr.. {insert ominous music}
[21:06] <mhwood> Completely out of my depth here, but my first guess is you need to write a small extension.
[21:07] <sandsfish> Hey sorry guys, I logged in earlier and didn't see too many people, so I assumed we were off for the holiday. Thanks to the few troopers still here!
[21:07] <mhwood> Glad to have you here.
[21:08] <peterdietz_web> or... I'd like to store DNS information backwards. instead of peterdietz.lib.ohio-state.edu it would index/store it as edu.ohio-state.lib.peterdietz
[21:08] <KevinVdV> Well if you want to split up the DNS fields by . you could always use the PatternTokenizerFactory
[21:08] <peterdietz_web> then I would facet upon edu.FIELD-I-CARE-ABOUT.*
[21:09] <KevinVdV> Flipping would also work ofc
[21:09] <peterdietz_web> are we solr 1.4 or solr 3
[21:09] <mhwood> Reversing the components still needs to tokenize DNS-style, so you have to solve that problem either way. What you do with the results depends on what works best for you.
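The reversal idea discussed above, sketched in Python under some assumptions: reverse_hostname, domain_key and top_domains are names invented for this illustration, and the depth parameter is one simple way to handle the multiple suffix lengths mhwood mentions (kcl.ac.uk needs three labels where osu.edu needs two). It splits on "." much as a pattern tokenizer would, then counts visitors by the leading labels of the reversed name.

```python
from collections import Counter


def reverse_hostname(host):
    """Reverse the dot-separated labels: a.b.edu -> edu.b.a"""
    return ".".join(reversed(host.lower().split(".")))


def domain_key(host, depth=2):
    """Keep the first `depth` labels of the reversed name, e.g. edu.osu"""
    return ".".join(reverse_hostname(host).split(".")[:depth])


def top_domains(hosts, depth=2):
    """Count hosts per reversed-domain prefix, most frequent first."""
    return Counter(domain_key(h, depth) for h in hosts).most_common()


hosts = [
    "peterdietz.lib.ohio-state.edu",
    "www.lib.ohio-state.edu",
    "dhcp-18-111-9-139.dyn.mit.edu",
]
print(reverse_hostname(hosts[0]))
print(top_domains(hosts))
```

With the name stored reversed, "everything under .edu" becomes an ordinary prefix match on "edu." instead of a leading wildcard.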
[21:11] <KevinVdV> http://lucene.472066.n3.nabble.com/Wildcards-at-the-Beginning-of-a-Search-td505007.html
[21:11] <KevinVdV> Something that may be of some use
[21:12] <KevinVdV> https://issues.apache.org/jira/browse/SOLR-218
[21:14] * sandsfish (~sandsfish@dhcp-18-111-9-139.dyn.mit.edu) Quit (Quit: sandsfish)
[21:14] <peterdietz_web> The problem with leading wildcard is similar to looking in the back of a book to find entries that start with sci* versus entries that end with *ence
[21:15] <peterdietz_web> sci, you jump to the S's and find SCI... ence, you have to review every single element in the index, to see if it matches ence.
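The book-index analogy can be made concrete with a small Python sketch. It models the term index as a plain sorted list, which is a simplification and not how Lucene actually stores terms; prefix_matches and suffix_matches are hypothetical names. A prefix lookup can binary-search to the first candidate and stop at the first non-match, while a suffix lookup has to examine every term.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Jump to the first possible match, then read until the prefix ends."""
    matches = []
    for i in range(bisect_left(sorted_terms, prefix), len(sorted_terms)):
        if not sorted_terms[i].startswith(prefix):
            break  # sorted order guarantees no later term can match
        matches.append(sorted_terms[i])
    return matches


def suffix_matches(sorted_terms, suffix):
    """No shortcut: sort order says nothing about how a term ends."""
    return [t for t in sorted_terms if t.endswith(suffix)]


terms = sorted(["art", "conscience", "science", "scientist", "zebra"])
print(prefix_matches(terms, "sci"))
print(suffix_matches(terms, "ence"))
```

The first call touches three terms; the second touches all five, and the gap grows with the size of the index.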
[21:15] <mhwood> Indirect evidence (rummaging in ~/.m2/repository) suggests that we're using Solr 1.4
[21:15] <peterdietz_web> We have 10M+ documents in Solr stats, extremely impractical to run this query.
[21:16] <peterdietz_web> Kevin, is there anything you've built that would help in reindexing solr stats?
[21:16] <peterdietz_web> i.e. adding Bundle Name, required reindexing
[21:17] <peterdietz_web> so.. my local need of adding field DNS_reverse would do similar.
[21:17] <KevinVdV> Yes I was just gonna suggest that, for the Bundle Name reindexing we dropped the solr index in a csv file & reuploaded these
[21:17] <KevinVdV> https://jira.duraspace.org/browse/DS-599
[21:18] <KevinVdV> You would have to delete the fields before reuploading though
[21:18] <mhwood> Unless this is a one-time report, though, it may still be useful to get Lucene to handle new entries itself.
[21:20] <peterdietz_web> new entries, after i alter schema, and add code to add dns_reverse, should work automatically.. flipping the legacy data, requires the reindex
[21:21] <peterdietz_web> cool. This gives me some direction to tackle this. I don't know if it will be needed by others (dns_reverse). But I'll try to upload a patch once we're done with it.
[21:21] <KevinVdV> Indeed to reindex I would recommend looking into the JIRA I just gave, I looked at it a LOT & this was the fastest way to do it
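A rough Python sketch of the export-and-reupload route for backfilling the new field. The log does not show the CSV layout produced by the DS-599 approach, so the column names here ("id", "dns", "dns_reverse") and the add_reversed_dns helper are assumptions made for illustration. It reads an exported CSV, appends a reversed-hostname column, and writes a new CSV that could then be uploaded again.

```python
import csv
import io


def add_reversed_dns(src, dst, source_field="dns", target_field="dns_reverse"):
    """Copy CSV rows from src to dst, adding a reversed-hostname column."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or [])
    if target_field not in fieldnames:
        fieldnames.append(target_field)
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        host = (row.get(source_field) or "").lower()
        # edu.ohio-state.lib.peterdietz style; empty stays empty
        row[target_field] = ".".join(reversed(host.split("."))) if host else ""
        writer.writerow(row)


# In-memory demonstration; real use would pass open file objects instead.
exported = io.StringIO("id,dns\n1,crawler.msn.com\n2,\n")
updated = io.StringIO()
add_reversed_dns(exported, updated)
print(updated.getvalue())
```

As KevinVdV notes above, the existing records would have to be deleted before the updated rows are uploaded.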
[21:24] <peterdietz_web> I think that was a very elegant solution. The typical alternative provided is "you need to reindex solr from your data source".. Umm our data source is several gigabytes of dspace.log files, and this reindex takes at least a week.
[21:24] <peterdietz_web> but... we're past 2100UTC, so I'm guessing we can break off now..
[21:24] <KevinVdV> Well I need some sleep anyways
[21:25] <peterdietz_web> cool, laters ya'll
[21:25] <KevinVdV> If you require some further aid feel free to contact me Peter Dietz, I have a lot of experience with solr & am always willing to share ;)
[21:25] <mhwood> I should leave too, before my battery expires. Thanks, all!
[21:25] <KevinVdV> Ps: Happy holidays !
[21:26] <mhwood> Yes, enjoy the holidays!
[21:26] * KevinVdV (~KevinVdV@d54C14B50.access.telenet.be) Quit (Quit: KevinVdV)
[21:26] <mhwood> I guess I should actually say something like "meeting adjourned."
[21:27] <mhwood> Bye all.
[21:27] * mhwood (~mhwood@ has left #duraspace
[21:34] * peterdietz_web (8092ad7a@gateway/web/freenode/ip. Quit (Ping timeout: 258 seconds)

These logs were automatically created by DuraLogBot on irc.freenode.net using the Java IRC LogBot.