Taxonomy of information work

I am now in extraordinary Budapest, Hungary.  I haven’t done much exploring yet, but the glimpses I have seen from the taxi from the airport have been tantalising.  I am here for the 2nd International Conference on Integrated Information (IC-ININFO) and am making last-minute adjustments to my paper and presentation (as you do!).  I attended the first in this series last year on Kos, Greece, and enjoyed it thoroughly.  It is really a different kind of conference – the only one of which I am aware that really does bring together people from all points of the information spectrum.

In my work this morning, I re-discovered a taxonomy of Library and Information Science (for want of a better term) which I developed about five years ago, in order to lay out the knowledge area/practice of those involved with work in cultural institutions of all types, but notably galleries, libraries, archives and museums (GLAMs).  I hope you find it interesting!  I would welcome your ideas and discussion on this, as I firmly believe that we are charged with two tasks at the moment:

1.  Being able to say clearly to non-information workers – and, yes, to others that work in different branches of the metafield – what it is we do and why we are so necessary to society; and

2. Developing a manifesto as a united body in order to persuade the powers that be that far more attention (and money) should be devoted to this kind of work, in order for the technology to develop in socially effective ways.  (I’m thinking that the EU plans for Information Society have fallen into a deep hole of technological determinism and will not otherwise find their way out).


Addendum: Taxonomy of LIS: the people who run Cultural institutions


The study of the creation, communication, recording, organisation, retrieval, preservation, access and interpretation of information and its social effects.


Knowledge creation

Indigenous knowledge systems

Research approaches and methodologies

Creativity and innovation

Knowledge representation and communication

Representation of information in language



Scholarly communication

Cyberinfrastructure (e-research, e-science)

Recorded information

History of writing: alphabets and numbers

History of documents: formats and types

Information design

Document design and typography

Information architecture (document design on the Internet)

Document access for the disabled, e.g. talking books, Braille, Kurzweil reading machines, etc.

Knowledge creation and communication, and document types

(by discipline and/or other characteristics, e.g. children’s literature; literature for neo-literates, etc.)

Human information behaviour

Identification of information needs/problems

Information behaviour of communities and groups

Information literacy (making meaning)


Critical literacy

Bibliographic literacy

Media literacy

Information usability

History and scope of information professions

(Those who deal primarily with information recorded on/in information objects such as documents).


Records Management

Electronic records management

Archival science

Manuscript management

Document and object conservation

Document and object preservation (including digital preservation)

Museum studies

Curatorial studies

Corporate information management (Note: ‘information management’ usually refers to corporate or organisational document management).

Knowledge management

Competitive intelligence


Community informatics

Development informatics

Health informatics

Social informatics

(Other informatics)



Physical document collections

(Libraries, information centres, archives, records centres, galleries)

History and evolution of each type of document collection

Types of libraries










Objectives of each type

Functions of each type

Document and artefact management – physical and virtual

Construction of metadata codes

Development of taxonomies (boundaries and structures of each knowledge domain; ideally should show intersections with other domains)

Development of ontologies: representation of information in codes

Classification codes

Enumerative hierarchical systems (e.g. Dewey)

Faceted classification systems (e.g. Ranganathan)

Indexing languages

Enumerative hierarchical systems (e.g. Library of Congress Subject Headings; MeSH)

Faceted indexing systems (e.g. PRECIS)

Thesaurus construction

Semantic Web

Organisation of information resources (i.e. documents)

Bibliographic analysis and description

Systematic bibliography

Analytical bibliography


Content, concept and discourse analysis




Mark-up languages (e.g. MARC, XML, RDF, etc.)

Service models


One-to-many (passive; standard in most libraries-as-place)

One-on-one (interactive; more common in special libraries)

One-on-one ongoing continuous over time (highly desirable but rarely encountered)

Outreach services (e.g. housebound and neo-literates) (a variation of one-to-many)

Mobile services (variation of one-to-many)


Digital libraries (remote access to digitised documents)

Online reference (usually email; can be VOIP e.g. Skype)


Interactive social networking techniques, e.g. social bookmarking, blogs, Flickr, RSS feeds, etc.

Second Life

Information retrieval

(Using systems, codes or programs to locate documents and information)


The reference interview and question interpretation

Retrieval techniques and processes

Metadata retrieval (from flat files and relational databases)

Full-text retrieval (from relational databases and hypertext)

Sound retrieval

Image retrieval

Video (or multimedia) retrieval

Information sources and retrieval (by discipline/group)










The role of information in society

Social effects of writing

Social effects of reading

Social effects of documents

Social effects of libraries, archives and other information/cultural centres

Libraries as cultural interventionists and mediators

Libraries in a multicultural global society

Transformative effects of information

Individual learning and development

Societal development

Social capital and social cohesion

Democracy, governance and citizenship

Social and community networking

Social entrepreneurship

Information ethics and laws


Intellectual property





Remiss or just missing?

Well, I’ve been both remiss and missing.  To misquote John Lennon, sometimes life gets in the way when you’re making plans.

There seems to be more activity now – at the bureaucratic and perhaps even policy levels – to acknowledge and perhaps merge the cultural institutions – namely libraries, archives, galleries and museums.  These are known variously as GLAMs or LAMs, depending on how inclusive you want to be and where you live.  Why is it that there is always some difference or discrepancy in the vocabulary used in this field?  If you have read some of my previous rants, you will know that this is something that irks me and that, in my view, has not only created conceptual obfuscation (deliberate choice of word), but is also leading to the clear demise of the associated professions – particularly librarianship and recordkeeping/archival work (should that be archivism?).

Moving on from semantic issues, it has long been recognised that the institutions that collect, preserve and provide access to recorded cultural memory all share similar goals and, by and large, similar procedures (see, for example, the 2008 IFLA report).  Sometimes this convergence seems to occur willy-nilly, for economic reasons, in ways that not all are happy with.  Well, yes, the procedures do appear to be similar – in essence, if not in detail – as do the goals: to enable the viewer/visitor to access and better understand what is collected.  Documents (I use the term loosely, to include any recorded expression of human thought) are collected or selected from the universe of available documents, according to varying guidelines and constraints.  Records managers select which documents to keep before the archivists get hold of them, even though archivists claim not to ‘select’, as such, the documents they keep.  Museologists are constrained, to a large extent, by what is ‘found’, even though items can be collected especially for them – even if only as a conquest of war, like the Elgin marbles from Greece, now unhappily resident in the British Museum.  Galleries will deliberately collect works of a specific type, age, authorship or perhaps nationality: the Tate Britain, the National Portrait Gallery and the Guggenheim Museum in New York are as distinguished by their collections as by their architecture.  Libraries, of course, select materials according to the (little understood) needs of their communities, the space they have available and their budget.  All of these ‘collections’ are, to a greater or lesser extent, reflective of their prevailing political regime, whether intentionally or not.

And there seems to be little disagreement about this.  As I have previously noted, I am of the view that all these professions belong to a metacommunity of information professionals, which may include such information technologists as are involved with the creation of digital cultural institutions, their description, storage and preservation.  Not all information technologists have equivalent expertise in this dimension.

The problem of digital collaboration in our new information environment is, unfortunately, far more profound and still recondite.  Providing a single point of entry into a heterogeneous world of virtual documents, each of which may reside in quite different physical spaces, sounds wonderful.  And indeed it is: not only because it is clearly impossible for every information seeker to visit every venue which holds potentially useful documents, but also because the juxtaposition of virtual documents provides the opportunity for new insights and fresh intellectual synergies.  It also means that the ‘user’ – so far, constructed in the information professions as various ‘types’ or rather generalised caricatures – is even less defined.  The virtual visitor to, for example, the painting ‘The battle of San Romano’ by Paolo Uccello, located in the famous Uffizi Gallery in Florence, may be an art historian or a child; a costume expert or a stage designer.  While we can, and perhaps should, provide context to the digital documents that we place in the virtual world, what elements of context are important?  Should we link such a work to the artist’s biography, the history of the battle, the development of perspective, the use of particular weaponry, contemporary artists, authors and philosophers – or the ways in which Uccello mixed his paints?  Indeed, everything is connected to everything else in some way or another.  And there are degrees of intellectual complexity as well, from beginner to expert.  In my opinion, these links or associative trails are what the internet is best at, and they should be fully exploited, as nothing happens without a context of some kind, and understanding this context enables us to better understand the idea.  A non-LIS book on the topic of context was recently published: ‘Situations matter: understanding how context transforms your world’ by Sam Sommers.

And this leads me to what I see as the crucial problem facing GLAMs: the notions of multidisciplinarity and interdisciplinarity.  These terms appear interchangeable, but there are, in fact, real differences.  ‘Multidisciplinarity’ refers to problems which require the expertise found in various different knowledge domains or disciplines.  Each discipline will retain its own methodologies and theoretical frameworks in order to solve the problem: these are not ‘shared’ between the disciplines.  Interdisciplinarity, on the other hand, transcends, or is found in between, any knowledge domains which claim to be a discipline.  In other words, by selecting elements of the various theoretical components (such as objects of study) from two or more disciplines, a new ‘interdiscipline’ is formed.  An example, perhaps, is biochemistry.

Leaving aside the question of whether the traditional information professions (such as librarianship) have associated academic disciplines, which I have discussed elsewhere, it seems as if a new ‘interdisciplinary’ discipline is now required, to provide a theoretical framework for the work that is already taking place towards collaboration, not only amongst the GLAMs, but also including other disciplines: computer science, of course, but also historians, anthropologists, sociologists, psychologists, designers and many other groups who could contribute to the ongoing manifestation of virtual information space.  This is not new: discussions of it go back to at least 2001 (!) and 2007, and give a taste of the zeitgeist (to mix metaphors).  But very little has been actioned, and one reason, I believe, is that the administrators do not really ‘get’ what we are all about.  Being clear to them means being clear to ourselves, and this is another reason why a theoretical framework for this field is important.

There are clear steps that guide the creation of a theoretical framework for this inclusive field:

1. Identification of the persistent or seminal entities and phenomena in the particular fields (i.e. those that are of interest to all groups involved).  This is the ontology.

2. Discovery and enunciation of the interrelationships between these entities and phenomena, which are called propositions or principles.  This is called a taxonomy.

3. Establishing the axiological commitments, and the ways in which ‘truth’ may be revealed.

4. The rules or principles that exemplify the interdiscipline – nomos.

5. The purpose or goal, or social responsibility, of the interdiscipline: the teleology.

Constructing a theoretical framework is part of the overall process of theory development, which is primarily a sequential process that begins with a broadly based descriptive and exploratory study, proceeding to the generation of explanatory studies, which may be accompanied by quantitative correlational studies.  The methodology of theory building, as suggested by Steiner (1988), involves criticism of extant theory, including explication and evaluation; and construction of new theory, by way of emendation and extension (Steiner, 1988, p. 1).

Why is a theory important?  So that we have conceptual clarity about what we work with, what we do, and the relationships we have to each other and to our communities, and so that we can appropriately structure education for the next generations.

Thinking about this will keep me busy until I write again.

Steiner, Elizabeth.   (1988).  Methodology of theory building.  Sydney: Educology Research Associates.

News that’s hot

A conference is currently being held called ‘The Future of Archives Symposium’ at Missouri University.  If you or your colleagues can’t make all of the events on the day, you can watch the streaming video or, shortly after the event, watch recorded sessions at:

or stay in touch through Twitter Hashtag #mudigital or Facebook:

A full schedule of the Symposium and other details can be found at

So, to kick off with definitions

I was taught, back in the day, that when indulging in academic discussion, it was vital to ‘first define your terms’.  So, I’ll bravely (or foolishly) start the discussion on definitions – or explanations – in order to achieve, in the end, mutually understood concepts.  I suggest that the aim here is not so much to attempt to develop a single phrasing or understanding of a term, as much as to understand how each term (‘word’ or ‘phrase’) is used within particular disciplinary territories, or perhaps even for different purposes.  In other words, how are these ‘terms’ conceptualised?

And I’m going to, perhaps even more bravely or foolishly, start with the terms that are so commonly used in our disciplines/professions.  (You’ll note that I put these words together, to suggest their strongly interrelated nature, much like Foucault used ‘power/knowledge’ for the same purpose).  These words are, I believe, data, information, knowledge, documents and records.  After a great deal of exposure to the literature in librarianship, information science, recordkeeping and archival science, I remained frustrated by the overwhelming number of definitions of these terms, and even more by the total lack of agreement and consistency in these definitions.  This gave me the impetus to study this topic in some detail for several years, and I arrived at the following conclusions.

The predominant image or metaphor currently expressed is that of a hierarchy, with ‘data’ at the bottom of a pyramid-shaped structure, supporting ‘information’ at the next level, and topped by ‘knowledge’.  The explanation is given that ‘data’ are the primary construction element: when these are ‘processed’, they become ‘information’ which, likewise, when processed, becomes ‘knowledge’.  Exactly what happens during the ‘processing’ phase is not explained.  It is presumed that this can be done by a computer, and so ‘data’ can be seen as synonymous with ‘bits’, which are processed and understood (by the computer) as ‘bytes’.  These bytes can, in certain sequences, be translated into various symbols (such as letters of various alphabets, punctuation marks, etc.) and so, in various combinations, form ‘words’.  These ‘words’ are not, of course, understood conceptually by the computer as referring to any other entity or phenomenon: they are simply sequences or patterns.  Algorithmically (and computer scientists, I stand to be corrected here), such patterns can be programmed as ‘correct’ or ‘incorrect’ – hence the development of spellcheckers, for example.
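For readers who would like to see this chain spelled out, here is a minimal sketch in Python of bits being grouped into bytes, bytes being translated into letters, and the resulting ‘word’ being checked against a list of accepted patterns.  The bit string, the tiny word list and the function names are all my own inventions for illustration, and a real spellchecker is far more elaborate; the point is only that nothing in the sequence involves the machine ‘understanding’ the word.

```python
# Illustrative sketch only: the word list and bit string are made up for this example.

def bits_to_bytes(bits: str) -> bytes:
    """Group a string of '0'/'1' characters into 8-bit bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bytes_to_symbols(raw: bytes) -> str:
    """Translate bytes into letters using the ASCII code table."""
    return raw.decode("ascii")


# A hypothetical list of 'correct' patterns; a toy stand-in for a dictionary file.
KNOWN_WORDS = {"cat", "dog", "document"}


def is_known_pattern(word: str) -> bool:
    """Pure pattern matching: the sequence is either in the list or it is not."""
    return word.lower() in KNOWN_WORDS


bits = "011000110110000101110100"          # 24 bits = 3 bytes
word = bytes_to_symbols(bits_to_bytes(bits))
print(word, is_known_pattern(word))         # the letters c, a, t are in the list
print("cta", is_known_pattern("cta"))       # same letters, 'incorrect' pattern
```

The check succeeds or fails purely on whether the sequence of symbols matches a stored sequence, which is the sense in which the ‘words’ remain patterns rather than concepts.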

The data-information-knowledge model therefore may be useful to computer scientists, if ‘data’ as a term is seen to represent the concept ‘bits’ – the presence or absence of an electrical or electro-magnetic charge.  However, it also suggests that computers are capable of producing ‘knowledge’, which is a conclusion with which I disagree.  Quite strongly.  If this model is used in a human context, it suggests that we accumulate ‘data’ somehow from our environment, and these include things like temperature (rather than the experience of heat or cold), and then process them into ‘information’ – presumably using only our cognitive abilities, which are sometimes regarded as little more than the add/subtract/compare processes of the computer.  In addition, ‘data’ are largely understood to be ‘facts’ – many dictionaries provide this as an explanation of the term.  The problem with ‘fact’ is twofold: firstly, it suggests that it is ‘true’ – and of course the notion of ‘truth’ and what it is remains largely unresolved, at least in philosophical circles; secondly, ‘facts’ are socially constructed.  We make ‘facts’ through the ways in which we frame time, space, measurement, power, and so forth.  ‘Information’ is understood to be some kind of result of the data having been analysed or ‘processed’.  But does this mean learned, understood, made meaning of?  And information in turn is ‘processed’ (once again, it is not clear what activities are included) in order to become knowledge.  Distinctions are not drawn between the kinds of knowledge that we have: knowledge of things, knowledge about things, knowledge of how to do things, etc.  I strongly resist the concepts of ‘tacit’ and ‘explicit’ knowledge, however, for reasons which will become clearer later.

I am of the view that ‘knowledge’ is the place to begin an analysis of the other two concepts.  ‘Knowledge’ is what we, as human beings, have: it is what we ‘know’ – whether we know we know it or not.  Sometimes we have forgotten ‘stuff’, or do not realise that we know ‘stuff’.  We acquire knowledge in a number of ways.  Firstly, we are born with the ability to acquire knowledge and language; after birth, we experience the world through our five senses.  Aristotle was particularly keen on this idea; Plato felt that our world was a mirage of the true or essential world.  We also acquire knowledge vicariously, through other people, or rather, other people telling us of their direct experiences.  If they record these experiences in some way, for example in writing, we may still learn from and experience their experiences across time and space.

So, I understand information to be that part of a person’s knowledge that he or she chooses to share with others, and which he or she will represent using a language of their choice, which may be spoken language, or dance or art or mathematics, for example.  Our understanding of the message will depend upon our ability to decode their language: we must be able to interpret the symbolic representation of their ideas.  Knowledge is, as previously stated, the sum total of the accumulation of ideas and experiences that we individually possess.  Data are culturally or socially constructed symbols; they may be numbers or figures, or statements: in either case, they are embedded in a particular contextual understanding and represent merely what is believed to be true at a particular moment and in a particular space (which is why I am able to refer to the ideas of Aristotle and Plato).

Finally, to extend this understanding of data, information and knowledge (DIK), I further believe that information is represented in language, and re-represented in writing (symbols that represent spoken language), which can be recorded on a material which may be more or less durable, and that material constitutes a “document”.  Thus, a document can be understood to be a container of information.  Some documents provide evidence of a business transaction, and these documents are known as “records”.

I look forward to your analysis, critique, and commentary on these ideas.

All the best, Sue