look out honey 'cause I'm using technology

Blog

this is vendor lockin…



File system full, but why?

UPDATE: I’ve posted my workaround code below, already have good feedback from Ryan (djatoka dev), and will be testing the proper fix on the server soon.
I’ve got a server that keeps filling up its disk and then failing to serve images once it hits the “file system full” error. First of all, let me say I don’t blame it in the least: if the admin (aka me) doesn’t secure the server enough disk space to do its job, I say, let me have it. But after I’d set the suspect daemon to use a *reasonable* amount of space, I stopped thinking of it as the culprit, so when the issue arose again, I looked elsewhere for the cause. Fast forward to today: the server’s file system filled up again and it refused to serve any more data. Again, I totally understand where the server is coming from. If it doesn’t have enough disk space to do its job, it shouldn’t have to apologize to anyone; it’s all on the admin (again, aka me). But what was going on? (more…)


Four free Linux eBooks

While looking for something else (which is mainly when I find *other* interesting things), I found an article that included links to four free Linux eBooks. This is a great resource for anyone with some Linux experience, as well as for others who may be looking to get started with tux. I would have loved to have this when I started, but that was before the Internet was available to most people. So, if you’re new to Linux, or want to get started (I used Red Hat Unleashed in 1996, here it is online!), here are some great downloads to learn from: (more…)


Resolving LSIDs with URL resolvers and CouchDB

Recently I’ve been looking at ways to solve some of biodiversity’s long-standing issues with LSIDs. For background: “Life Science Identifiers are a way to name and locate pieces of information on the web. Essentially, an LSID is a unique identifier for some data, and the LSID protocol specifies a standard way to locate the data (as well as a standard way of describing that data). They are a little like DOIs used by many publishers.” I posted my thoughts on the topic to the TDWG discussion mailing list, and am reprinting them here to allow for further community commentary; Code4lib, I’m looking at you. While much of it is theoretical, it is doable, and if it covers all that needs to be addressed, it would be a cool, sustainable way forward for link resolvers for all kinds of usage.

I’m with Tim on this one, and taking one of Rod’s other posts (“LSIDs, disaster or opportunity”) a bit further, I think a simple, extensible URL resolver would give us many benefits and let LSIDs carry extra information for all to use. Looking at his example, a URL would get permanent tracking that would also record referrers, location and traffic. A summary of the link could even be a page in itself: a cached version, a screenshot, or just a scrape of the code with the HTML tags pulled out, kept for future reference in case the real link goes down. We could use a customizable prefix (e.g., http://someresolvr.com/bhl/SDFoijF) to loosely follow DOI conventions, and could even save old DOIs or handles in a field attached to the new URL, either for historical purposes or for reuse, so that the new URL resolves to a current DOI via a simple suffix at the end (e.g., http://someresolvr.com/bhl/SDFoijF/DOI). In the same way we could take user input and data pulled semantically about the URL to generate RDFa (using pyRdfa), expose that for all newly created URLs, and settle on a standard to make it predictable (e.g., http://someresolvr.com/bhl/SDFoijF/RDF). The example at bit.ly uses Open Calais to get more background on the original link, but the resolver could also point to other services we provide or use in biodiversity, giving a snapshot of more context and content across the board. Users of the service could log in to examine, add or edit the data by hand if desired, so they would retain ultimate control over how their record is presented. Thus, from a simple URL, we could build a complete summary that builds on what we’re given while sharing it all back out.
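To make the URL scheme a little more concrete, here is a minimal sketch in Python of how a resolver might split a short URL into its prefix, short ID and optional suffix (DOI, RDF) and look up the matching field. Everything in it is an assumption for illustration: the in-memory `RECORDS` dictionary stands in for the real database, and the field names, target URL and DOI value are made up.

```python
from urllib.parse import urlparse

# Hypothetical in-memory store standing in for the resolver's database.
# Keys are (prefix, short_id); the field names and values are invented.
RECORDS = {
    ("bhl", "SDFoijF"): {
        "target": "http://example.org/original-document",
        "doi": "10.9999/example-doi",
        "rdf": "<rdf:RDF></rdf:RDF>",
    }
}


def parse_short_url(url):
    """Split a resolver URL into (prefix, short_id, representation)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) not in (2, 3):
        raise ValueError("expected /prefix/id or /prefix/id/representation")
    prefix, short_id = parts[0], parts[1]
    # No suffix means "send me to the original link".
    representation = parts[2].lower() if len(parts) == 3 else "target"
    return prefix, short_id, representation


def resolve(url, records=RECORDS):
    """Return the stored value for the requested representation, or None."""
    prefix, short_id, representation = parse_short_url(url)
    record = records.get((prefix, short_id))
    if record is None:
        return None
    return record.get(representation)
```

Calling `resolve("http://someresolvr.com/bhl/SDFoijF/DOI")` would hand back the stored DOI, while the bare URL would hand back the original target. A real service would answer with an HTTP redirect rather than a return value.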

Then the architecture (aka the fun part) would be simple and distributed. A web server able to process PHP and running the CouchDB database would be all that is needed to run the resolver. CouchDB is schema-less and built to be distributed: its replication is simple, handing out only the bits that have changed, and it scales in the same manner. Having a batch of main servers behind one URL in a pooled setup (think of a simplified, smaller version of the pool of Unix networked time servers) would allow round-robin DNS, or a ucarp setup (“ucarp allows a couple of hosts to share common virtual IP addresses in order to provide automatic failover”), so if one main server went down, another would automatically take over without the user needing to change the URL. Plus, to battle heavy usage of the main servers, we could use the idea of Primary and Secondary servers as outlined in the pool.ntp.org model, so an institution with heavy usage could become a Secondary host and run its own resolver simply, with almost no maintenance. It would just need the PHP files, which would be a versioned project, and a cron task to replicate the database from a pool of the main servers. The institution’s resolver could be customized to appear as its own (e.g., http://someresolvr.bhl.org/bhl/SDFoijF) and for simplicity could be read-only. This way a link like http://someresolvr.com/bhl/SDFoijF could be resolved against any institution’s server, like http://someresolvr.bhl.org/bhl/SDFoijF or http://someresolvr.ebio.org/bhl/SDFoijF, as all of the databases would be the same, although maybe a day behind, depending on the replication schedule.
New entries would only be entered on a main server, or in ‘the pool’ (e.g., http://pool.someresolvr.com/), and those changes would then be in the database to be handed out to everyone on the next replication (I won’t add my P2P ideas in this email; they may not be needed for the deltas that would be transferred daily or weekly). Add to all of this that CouchDB is designed as “…a distributed, fault-tolerant and schema-free document-oriented database”, which fits what we want to do: build a store of documents (data) about a URL that we can serve, while being a permanent, sustainable resolver to the original document. If the service ever died, it could be resurrected from anyone’s copy of the database (think LOCKSS, Lots of Copies Keep Stuff Safe), so that no data (original or accumulated) would be lost. The data could be exported from the database in XML, and then migrated from that to a desired platform.
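As a rough sketch of the cron task a Secondary host would run, the Python below builds a pull-replication request for CouchDB’s `_replicate` HTTP endpoint, rotating through the pool of main servers. The server names, the database name `resolver` and the local address are all hypothetical, and the code only builds the request; actually sending it (for example with `urllib.request.urlopen`) is left out so nothing here touches a network.

```python
import itertools
import json
import urllib.request

# Hypothetical pool of main servers; 5984 is CouchDB's default port.
MAIN_SERVERS = [
    "http://main1.someresolvr.com:5984",
    "http://main2.someresolvr.com:5984",
]
# Simple round-robin over the pool, one server per run.
_rotation = itertools.cycle(MAIN_SERVERS)


def build_replication_request(local_couch, db_name, source_server):
    """Build a POST to the local CouchDB asking it to pull db_name
    from source_server into the local database of the same name."""
    body = json.dumps(
        {"source": f"{source_server}/{db_name}", "target": db_name}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{local_couch}/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def next_pull(local_couch="http://localhost:5984", db_name="resolver"):
    """Pick the next main server in the rotation and build the request."""
    return build_replication_request(local_couch, db_name, next(_rotation))
```

Since replication only moves what has changed, running this daily or weekly from cron should keep a read-only Secondary close to the pool without much traffic.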

I have not been dealing with LSIDs as long as most on this list, so I expect I’m glossing over (or missing) some of the concepts; please let me know what I am lacking. This is a needed service, and it is a project I’d like to be involved in building.


Red Dwarf: Back to Earth


The British comedy Red Dwarf has been a favorite of mine for many years, and this year it celebrates its 20th anniversary. Since the show hasn’t *constantly* been in production, that figure is a bit misleading, but regardless, this year the crew of The Cat, Rimmer, Lister and Kryten are reuniting for a new three-part series, Back To Earth. (more…)










We support


EFF - Electronic Frontier Foundation       TOR - The Onion Router       HRC - Human Rights Campaign








geek

Yummy!
School spies on student, busts him for…eating candy

Today fak3r from fak3r.com and Matt from Obtuseview.com are working together to bring you a multi-p

More in geek

politics

Twenty-six Lies About H.R. 3200

With all the craziness around the health care debate, the facts are getting lost.  There is simply

More in politics

music

Best music of 2009

Well 2009 was another stellar year for music if you ask me, and as usual, my yearly ‘top

More in music

art

Dark Night of the Soul

Notice: the text of this post in the gray, blockquote area was taken from the website Look Into My

More in art
