[20:02] * luizsan (~luizsan@ Quit (Ping timeout: 244 seconds)
[20:03] * luizsan (~luizsan@ has joined #duraspace
[20:03] <tdonohue> Sorry I'm a few minutes late (doorbell ring at the wrong time). But, it's time for our weekly DSpace DevMtg!
[20:03] <tdonohue> https://wiki.duraspace.org/display/DSPACE/DevMtg+2016-06-22
[20:03] <kompewter> [ DevMtg 2016-06-22 - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2016-06-22
[20:04] <luizsan> hi
[20:04] <tdonohue> Obviously last week was OR16, and it was wonderful seeing everyone (who was there)!
[20:04] <tdonohue> Hi luizsan
[20:05] <tdonohue> I did post up the meeting notes from the OR16 DSpace Developer / DCAT meeting on the wiki (and slides, at least those I have so far): https://wiki.duraspace.org/display/DSPACE/DevMtg+2016-06-13+-+OR16+Meeting#DevMtg2016-06-13-OR16Meeting-MeetingNotes
[20:05] <kompewter> [ DevMtg 2016-06-13 - OR16 Meeting - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DevMtg+2016-06-13+-+OR16+Meeting#DevMtg2016-06-13-OR16Meeting-MeetingNotes
[20:07] <tdonohue> I've also been told that most sessions were recorded at OR16 (after workshop day, obviously). So, recordings should be shared around publicly in the next week or so, for anyone who missed seeing a particular talk
[20:07] <tdonohue> My slides from the new UI talk have also been posted up (and that session was also recorded): http://www.slideshare.net/tdonohue/introducing-the-new-dspace-user-interface
[20:07] <kompewter> [ Introducing the New DSpace User Interface ] - http://www.slideshare.net/tdonohue/introducing-the-new-dspace-user-interface
[20:08] <luizsan> Good! I want to see those recordings.
[20:08] <tdonohue> Anyone else have something to share or say about OR16? Wanted to leave some time for any other "wrap up tasks". I will note that the new UI work will proceed soon...but I feel we need to really get 6.0 wrapped up and out the door (as a much higher priority)
[20:09] <terry-b> The opportunity to converse with several DSpace users was awesome
[20:10] <hpottinger> my recap: http://hardyoyo.thebignow.com/article/53/or2016-dublin-recap
[20:10] <kompewter> [ article: hardyoyo: OR2016 Dublin Recap ] - http://hardyoyo.thebignow.com/article/53/or2016-dublin-recap
[20:11] <tdonohue> aha, yes, thanks for sharing that hpottinger :)
[20:11] <hpottinger> I just added a link to your slides
[20:12] <luizsan> I have a question, do you guys already know where the next one is going to be?
[20:13] <tdonohue> Ok, so not hearing any other specific OR16 topics for now. I will keep everyone in the loop on UI activities and plans to "ramp that back up". As of yet, my concentration is moving back to 6.0 to help try and wrap that up. But, will be helping get planning started for new UI work right after 6.0
[20:13] <hpottinger> luizsan: http://or2016.net/save-date-or2017/
[20:13] <tdonohue> luizsan: yes, the next OR conference will be in Brisbane, Australia
[20:13] <kompewter> [ Save the date - OR2017 - OR 2016 ] - http://or2016.net/save-date-or2017/
[20:14] <hpottinger> Oh, an OR16 thing: we have work to do with filtering out robots more effectively
[20:14] <luizsan> thanks, too far away.
[20:15] <hpottinger> I'll make a Jira issue re: bot filtering improvements, I'm kinda waiting for Joseph Greene's slides to be linkable
[20:15] <tdonohue> hpottinger: yes, I missed that talk. If there are actions out of it, it'd be good to create some JIRA issues ;)
[20:17] <hpottinger> his article is in Research Repository UCD, but it's under embargo until after it's published :-(
[20:17] <hpottinger> http://hdl.handle.net/10197/7682
[20:17] <kompewter> [ Web robot detection in scholarly Open Access institutional repositories ] - http://hdl.handle.net/10197/7682
[20:17] <tdonohue> Crud. I'm completely sorry, but I'm going to need to step away....I've got a basement flooding situation from storms today (not rapidly but still doing so) that I need to get a handle on ASAP.
[20:17] <luizsan> tdonohue: In your presentation, you said the REST-API needs developers, how can I contribute? Jira?
[20:17] <tdonohue> Could I ask someone to take over? Perhaps just continue reviewing 6.0 PRs in this list (and assigning/ testing / volunteering): https://github.com/DSpace/DSpace/milestones/6.0
[20:17] <kompewter> [ Issues · DSpace/DSpace · GitHub ] - https://github.com/DSpace/DSpace/milestones/6.0
[20:18] <hpottinger> we have this, tdonohue, go fix your flood
[20:18] <mhwood> There is already a network of "fix problems with robot detection" issues...
[20:19] <hpottinger> mostly we need a tool or UI to help with discovering and pruning "new" robots, I found quite a few when I looked earlier today
[20:20] <mhwood> DS-2431 is one.
[20:20] <kompewter> [ https://jira.duraspace.org/browse/DS-2431 ] - [DS-2431] dspace stats-util --mark-spiders doesn't use name or agent patterns - DuraSpace JIRA
[20:20] <hpottinger> oh, yes, that
[20:21] <hpottinger> we could also try to add a heuristic for "large number of downloads + rapid rate of requests = robot"
[20:22] <luizsan> hpottinger: Something like a servletfilter monitoring the http request?
[20:23] <mhwood> And then 2000 DSpace sites all individually update their tables. Or don't. We need a way to share this information.
[20:24] <mhwood> luizsan: we already log enough information. We just need to make better use of it.
[20:25] <hpottinger> I was thinking I might be able to fashion a Solr query to identify request rate by comparing timestamps from the same IP
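The "large number of downloads + rapid rate of requests" heuristic discussed above can be sketched roughly as follows. This is a minimal illustration only, not DSpace code and not a Solr query: it assumes the access records have already been exported as (IP, timestamp) pairs, and the function name and both thresholds are hypothetical values picked for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_probable_robots(hits, min_hits=100, max_mean_gap_seconds=1.0):
    """Return the set of IPs whose request pattern looks automated.

    hits: iterable of (ip, datetime) pairs (an assumed export format).
    An IP is flagged when it made at least min_hits requests AND the
    mean gap between its consecutive requests is at most
    max_mean_gap_seconds. Both thresholds are illustrative guesses.
    """
    by_ip = defaultdict(list)
    for ip, stamp in hits:
        by_ip[ip].append(stamp)

    flagged = set()
    for ip, stamps in by_ip.items():
        # Need at least two hits to measure a gap between requests.
        if len(stamps) < max(min_hits, 2):
            continue
        stamps.sort()
        span = (stamps[-1] - stamps[0]).total_seconds()
        mean_gap = span / (len(stamps) - 1)
        if mean_gap <= max_mean_gap_seconds:
            flagged.add(ip)
    return flagged
```

Requiring both conditions keeps a busy but slow client (many hits spread over hours) and a short burst from a human (a handful of quick clicks) out of the flagged set.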
[20:26] <luizsan> mhwood: A kind of robot monitoring central? Where DSpace sites around the world could put robot info?
[20:26] <mhwood> Maybe not limited to DSpace.
[20:26] <hpottinger> +1 bot ID sharing, I've found a few more domains to add to our existing examples
[20:27] <luizsan> mhwood: it could be huge.
[20:27] <hpottinger> https://gist.github.com/hardyoyo/7c97314945b8e52abf9a10cc3dd8ecd2
[20:27] <kompewter> [ additional_robot_domains_for_dspace.txt · GitHub ] - https://gist.github.com/hardyoyo/7c97314945b8e52abf9a10cc3dd8ecd2
[20:27] <mhwood> At OR we talked briefly about putting a DNS blackhole service in front of the shared database. Of course, we'd have to find a way to fund the service.
[20:28] <hpottinger> right, something like Akismet for Wordpress
[20:29] <luizsan> hpottinger: I don't know what Akismet does.
[20:30] <hpottinger> https://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists
[20:30] <kompewter> [ Comparison of DNS blacklists - Wikipedia, the free encyclopedia ] - https://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists
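For readers unfamiliar with DNS blacklists: a client checks an address by reversing its IPv4 octets, appending the list's zone, and looking that name up in DNS. The sketch below only builds the query name and performs no network lookup. The zone `bots.example.org` used in the usage note is a made-up placeholder, since no shared robot list existed at the time of this meeting, and the function name is hypothetical.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name a client would look up to check ip against zone.

    Follows the usual DNS blacklist convention: reverse the IPv4 octets
    and append the list's zone. This sketch handles IPv4 only.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"
```

For example, `dnsbl_query_name("192.0.2.1", "bots.example.org")` yields `1.2.0.192.bots.example.org`; a listed address would resolve, an unlisted one would not.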
[20:30] <hpottinger> https://wordpress.org/plugins/akismet/
[20:30] <kompewter> [ Akismet — WordPress Plugins ] - https://wordpress.org/plugins/akismet/
[20:31] <hpottinger> The gist of Akismet is it's a shared DB for all Wordpress sites, the goal is to eliminate comment spam
[20:31] * mhwood wishes the big search engines would agree on a request header that means "I'm a good robot". Then we could concentrate on the bad bots.
[20:32] <hpottinger> well... mostly we want to drop all bot hits from our stats, good or bad doesn't make any difference
[20:33] <mhwood> But if a bot admits it is a bot then we can just mark that access as such. No work. Requests without the header get further examination.
[20:34] * Dylan (~Dylan@ has joined #duraspace
[20:34] <hpottinger> the "good" bots do self-identify... it's the custom ones (machine interfaces = more bots) we need to worry about
[20:34] <luizsan> kompewter: thanks for the link
[20:34] <mhwood> Plus we might want to count bots, but categorize accesses: "Accesses in the last 24 hours: 1032 probable human, 4344 Google, 10443278 Baidu...."
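The per-category tally suggested above could look something like this. It is a minimal sketch under stated assumptions: the input is a plain list of User-Agent strings, the pattern table is a tiny illustrative sample (Googlebot and Baiduspider are the tokens those crawlers send), and anything unmatched is simply labelled "probable human", which a real deployment would want to examine further.

```python
import re
from collections import Counter

# Illustrative sample only; a real list would be much longer.
AGENT_PATTERNS = [
    ("Google", re.compile(r"Googlebot", re.IGNORECASE)),
    ("Baidu", re.compile(r"Baiduspider", re.IGNORECASE)),
]


def categorize(user_agent):
    """Return a category label for one User-Agent string."""
    for label, pattern in AGENT_PATTERNS:
        if pattern.search(user_agent or ""):
            return label
    # No known robot pattern matched.
    return "probable human"


def summarize(user_agents):
    """Count accesses per category instead of discarding robot hits."""
    return Counter(categorize(ua) for ua in user_agents)
```

Keeping the robot counts, rather than dropping those rows, is what makes a report such as "1032 probable human, 4344 Google" possible.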
[20:35] <hpottinger> I tag with isBot and then drop those rows pretty regularly
[20:35] <luizsan> mhwood: And what about bots in the REST-API?
[20:36] <mhwood> Good question. We should be logging content access, at least, but I don't know if we do.
[20:37] <luizsan> hpottinger: "pretty regularly" means with a high level of requests per second?
[20:38] <hpottinger> "pretty regularly" = with a cron job
[20:38] <luizsan> hpottinger: Ok I got it
[20:38] * Dylan (~Dylan@ Quit (Ping timeout: 276 seconds)
[20:41] <hpottinger> OK... so.. I volunteer to write a ticket or two based on Joseph Greene's talk "I Can Haz Robot?"
[20:42] <mhwood> That sounds lovely. Thanks.
[20:45] <hpottinger> next?
[20:45] <hpottinger> actually, do we have a "quorum" any more? is it just us two?
[20:46] <luizsan> I'm here!
[20:46] <luizsan> three of us?
[20:47] <hpottinger> :-) we typically need three committers for voting purposes
[20:48] <luizsan> This is one of my questions, is a committer anyone who has already committed to the DSpace code, or is there more to it?
[20:48] <terry-b> I am here.
[20:49] <hpottinger> https://wordpress.org/plugins/akismet/
[20:49] <kompewter> [ Akismet — WordPress Plugins ] - https://wordpress.org/plugins/akismet/
[20:50] <hpottinger> darn it... one moment
[20:50] <hpottinger> https://wiki.duraspace.org/display/DSPACE/DSpaceContributors
[20:50] <kompewter> [ DSpaceContributors - DSpace - DuraSpace Wiki ] - https://wiki.duraspace.org/display/DSPACE/DSpaceContributors
[20:52] <luizsan> ok, I'm not a contributor! LOL
[20:54] * dyelar (~dyelar@rrcs-74-62-138-4.west.biz.rr.com) Quit (Quit: Leaving.)
[20:55] <hpottinger> so, with terry-b we have three, we can continue, I suppose, though we only have about 6 minutes in our hour
[20:57] <luizsan> Ok, I'm sorry if I am annoying you guys, but I saw in Tim's UI presentation that DSpace needs volunteers for the REST-API, what is the best I tried to find some tickets in JIRA about the REST-API but I was not able to find any.
[20:58] <luizsan> *what is the best way to do it
[21:02] <terry-b> luizsan, The REST work is probably not in JIRA. I will send you a link to one of the UI working group meetings. You could probably join there and ask tdonohue for some guidance on where to start
[21:03] <terry-b> Check out https://groups.google.com/forum/#!searchin/dspace-tech/angular/dspace-tech/dpfmHx1wdc8/emt6eL8XOwAJ
[21:03] <kompewter> [ Google Groups ] - https://groups.google.com/forum/#!searchin/dspace-tech/angular/dspace-tech/dpfmHx1wdc8/emt6eL8XOwAJ
[21:03] <luizsan> Thank you guys!
[21:07] <terry-b> It will be great to have you involved on the project!
[21:13] <mhwood> Got to run. Bye all!
[21:13] * mhwood (mwood@mhw.ulib.iupui.edu) has left #duraspace
[21:18] <hpottinger> well... I say we're adjourned :-)
[21:28] * luizsan (~luizsan@ has left #duraspace
[21:29] * hpottinger (~hpottinge@mu-161168.dhcp.missouri.edu) Quit (Quit: Leaving, later taterz!)
[21:52] * tdonohue (~tdonohue@c-98-220-55-31.hsd1.il.comcast.net) has left #duraspace
[22:35] * Dylan (~Dylan@ has joined #duraspace
[22:40] * Dylan (~Dylan@ Quit (Ping timeout: 276 seconds)
[23:59] * Dylan (~Dylan@ has joined #duraspace

