I am currently reading: Coding with XML for Efficiencies in Cataloging and Metadata by Timothy W. Cole, Myung-Ja (MJ) K. Han, and Christine Schwartz.
I have been working with XML for a few years now in different contexts – but I have always been working with the structure itself – not its search, transformation, or display.
The other reason this matters for my work: the digital preservation platform we are migrating to from CONTENTdm may require us, for certain types of digitally preserved materials, to write XSL. If we need this, I will be able to help more now. 🙂
Thank you to the authors for this book.
The Library of Congress publishes the standard set of geographic area codes that go into Machine-Readable Cataloging (MARC) records. The format is now called MARC 21, and it is just one of the record formats that metadata technicians and cataloguers use to create and structure information in a networked environment.
These are the records you see when you search your local library for items.
The purpose of the geographic codes is to add a fixed version in standardized code form to reflect any geographic subject or relationship of the work in hand. It is actually these fixed forms of information that the computer reads.
For example: We read the words that say “United States,” “Indonesia,” “Iran” or “Kenya” in the subject section of the item-record in the catalogue. This text is usually hyper-linked because we can use it to browse other resources that are categorized under the same subjects or have similar geographic relationships. These terms are often added to the 650 or 651 field in the MARC 21 records.
But the computer needs standardized forms of this information in order to organize properly. Thus, “United States” is read by the computer as: ‡an-us---, Iran is read as: ‡aa-ir--- and Kenya is read as: ‡af-ke---. These codes all go, as many as are needed, into the 043 field.
There is even a code for the whole Earth: ‡ax------ and for the Solar System: ‡azs-----
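As a minimal sketch of how these fixed forms are put together: each geographic area code is seven characters long, padded on the right with hyphens (written here as plain hyphens), and each one sits in its own ‡a subfield of the 043 field. The function names and the small sample table below are my own illustration, not part of any cataloging system or library.

```python
GAC_LENGTH = 7  # MARC geographic area codes are seven characters long

# A few codes mentioned in this post; this is NOT the full code list.
SAMPLE_CODES = {
    "n-us---": "United States",
    "a-ir---": "Iran",
    "f-ke---": "Kenya",
    "x------": "Earth",
    "zs-----": "Solar system",
}


def pad_gac(code: str) -> str:
    """Pad a short form such as 'n-us' with trailing hyphens to seven characters."""
    code = code.strip().lower()
    if not code or len(code) > GAC_LENGTH:
        raise ValueError(f"not a plausible geographic area code: {code!r}")
    return code.ljust(GAC_LENGTH, "-")


def build_043_subfields(codes) -> str:
    """Join codes into a display string with one ‡a subfield per code (hypothetical helper)."""
    return "".join("‡a" + pad_gac(code) for code in codes)


if __name__ == "__main__":
    subfields = build_043_subfields(["n-us", "a-ir", "f-ke"])
    print("043", subfields)
    for code in ("n-us", "x", "zs"):
        full = pad_gac(code)
        print(full, "=", SAMPLE_CODES.get(full, "not in sample table"))
```

A real system would check each code against the full published list rather than this five-entry sample.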
But the MARC 21 geographic code list, whatever else it might be criticized for, is missing a fundamental geographic representation: it has no code to reflect the Internet or any time-space segment of reality connected with cyberspace.
I think it’s high time to repair this gaping hole in the code list.
We need a geographic code that represents the Internet, Cyberspace, the World Wide Web (Inter-webs) – whatever you call it in your language – as a discrete and specific geographic location.
I have ideas too…
We can’t use ‡ai------ (which could stand for Internet) because that is claimed for the Indian Ocean. And we can’t claim ‡ac------ (which could stand for Cyberspace) because this has already been claimed for Intercontinental areas (Western Hemisphere). No, we need another code.
And I just happen to have found a gap in the code sequence allowing the perfect code to slot in.
I think we should use: ‡ait----- (to represent the Internet). 🙂 Not only does this code arrangement reflect major letters in “Internet,” but it also accomplishes a secondary goal of reflecting the work of the Internet itself: IT (Information Technology).
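For anyone who wants to test their own candidate codes, here is a rough sketch of the collision check described above. It pads a short form to seven characters with hyphens and looks it up in a table of assigned codes. The table holds only the codes named in this post, so a "free" result here is an assumption to verify against the full published list, and `check_proposal` is a made-up name for illustration.

```python
# Partial sample of assigned codes taken from this post; NOT the full code list.
ASSIGNED = {
    "i------": "Indian Ocean",
    "c------": "Intercontinental areas (Western Hemisphere)",
    "x------": "Earth",
    "zs-----": "Solar system",
}


def check_proposal(short_code: str, assigned=ASSIGNED):
    """Return (seven-character code, current holder or None if unclaimed in the sample)."""
    short_code = short_code.strip().lower()
    if not short_code or len(short_code) > 7:
        raise ValueError(f"not a plausible code: {short_code!r}")
    full = short_code.ljust(7, "-")
    return full, assigned.get(full)


if __name__ == "__main__":
    for candidate in ("i", "c", "it"):
        full, holder = check_proposal(candidate)
        status = f"taken by {holder}" if holder else "free in this sample"
        print(full, "->", status)
```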
This is revolutionary…
Who’s with me?
Thanks for reading.
From CHICAGO’S McCormick Place/Convention Center: Watching traffic heading south along Lake Michigan with the lake a mere 200 yards in the distance – beautiful.
Vendors and exhibitors are currently setting up for 5 days of learning and connecting on all things library. This will include books, ebooks and author events, of course, and many publishers are in attendance. But there are also lots of technology vendors, as well as committee meetings engaging in “think-tank” planning for the future of academic, public and school libraries. This exhibition/conference will bring together current and proposed best practices in technical and patron services.
It’s not too late to register. I for one am excited.
Make sure to follow ALA Annual 2013 events on Twitter with the hashtag: #ala2013
I’ll be tweeting through the event from @jltaglich and @meta21st
Don’t hesitate to chat or express all thoughts.
Thanks for reading.
In another post, the idea was brought up that there is a disconnect between information that humans make, produce or understand (think) and information (data) that computers are structured to use as they communicate with other parts of the machine (or between machines). This might have as much to do with tagging posts in a blog or adding labels to items posted on publication platforms such as Google’s Blogger as with writing descriptions while cataloging items in Millennium or Ex Libris Voyager. The public interacts with these last two through a library’s search catalog (its OPAC) at a publicly available URL. The blogging interfaces work differently.
There are some similarities between each of these. But basically, the similarities revolve around code built into the systems because these are assumed to be how knowledge is categorized. The above article highlighted as “tagging” suggests platforms such as WordPress have categories and tags. The blend of these features creates a general “box” for the knowledge in a given post, while the tags add a little nuance that supposedly helps make its “aboutness” clearer for readers. This knowledge organization structure is assumed with the use of the technology, and nothing more is available to the user at any given time than what the designers assumed to be most correct (or justified) at the time. Every piece of machinery has this arrangement, but the ubiquitous use of these technologies currently means that these set modes of knowledge organization are foisted upon more and more people.
Millennium and Ex Libris Voyager have their own built-in assumptions about knowledge organization and their own ways of applying metadata to items – in this case, surrogate records that stand in for items rather than being the items themselves. The distinction between the post and the surrogate record means that even though there are still many machine-specific assumptions in every technology mentioned thus far, the surrogate is STILL a very different interaction because it is not, in most cases, read for its own sake. Both of these technologies have certain set fields within their interfaces that cannot be changed, even if they can be fine-tuned to a far greater degree than any of the web-publishing technologies mentioned above.
Today, however, I was in a conversation with a polyglot cataloger at the Library of Congress who works with serials in many languages (and is currently working with a collection of items from Harry Houdini‘s library donated to special collections). The conversation was specifically on data-about-data (metadata) and the ways in which technologies do and do not accomplish certain jobs which they could accomplish if certain arrangements were different. She told us that even with the code-style used in cataloging (MARC – Machine-Readable Cataloging), the detailed set of rules for each field and sub-field (including the formatting of those sub-fields) and all the facets of information that can be added to the surrogate record made in the cataloging module, the technology is still quite limited. By this she meant at least one important point: even though there are so many methods within this technology to describe artifacts, the human mind recognizes, and is frustrated by, the single method offered to accomplish the cataloger’s goals.
The same conversation included a man, also from the Library of Congress but from the Preservation Directorate – Re-formatting Division, who has written on the modes of expression that are possible in describing any given work but go unused because someone has already decided what kinds of information count as data. There are a great number of factors in these decisions, but many of them have to do with socio-economics. These decisions do not revolve only around issues about people or writing. Rather, they are also tied to “truths” about physical and mathematical sciences from positions of power. For a good read on this topic, I heartily recommend “Cataloging Theory in Search of Graph Theory and Other Ivory Towers,” a paper that has this post’s topic as one facet. The paper is available in a pre-print format from the American Library Association here. And again, both of these library-minded people recognize that even though computers and IT-minded groups/companies have done a lot in the world, they may not have set the world up for a multitude of knowledge organization structures, even though most technologies in use today are capable of so much more than what is being taken advantage of at the present time. Machines do certain things really well. But they only do what they do. Humans do the rest (and built those machines).
As always, dialogue is welcome here or on Twitter.