CHAPTER FIVE

THE BIBLIOGRAPHIC RECORD IN THE ONLINE ENVIRONMENT: CONTENT AND FUNCTIONAL ANALYSIS

0 INTRODUCTION

0.1 The aims, scope and approach of this chapter

This chapter is in two parts. Part one will analyse the bibliographic record in terms of the data elements which are used to describe and provide intellectual and physical access to bibliographic entities. Its main aim is to explore the content and functions of bibliographic records as a whole and with respect to the possible impact of the online environment on them. Through an analytical approach and a matching technique, the study will help to identify and examine the elements necessary for each function of the record and how these are influenced by the capabilities and characteristics of the online environment. Functional analysis will be concerned here with a review of the relevance of individual data elements to the processes in four functional areas: searching, retrieval, browsing and display of bibliographic records. In this context, the chapter is a prerequisite to Chapter 6, in that the re-examination of cataloguing principles and rules needs to be carried out with respect to the different functions of the bibliographic record and the various data elements within it.

Part two will demonstrate the need for a greater uniformity in the indexing of data elements for online retrieval which may significantly influence the functionality of the catalogue record.

Based on the recommendations of the Seminar on Bibliographic Records, held in Stockholm in 1990, the IFLA Study Group on Functional Requirements of Bibliographic Records has undertaken to define the functional requirements of bibliographic records in relation to both the variety of user needs and media (Bourne, 1992: 145). The purpose of the Study, an ongoing work in this area, is "to delineate in clearly defined terms the functions performed by the bibliographic record with respect to various media, various applications, and various needs" (IFLA Study Group on the Functional Requirements of Bibliographic Records, 1992: 1). The Study attempts to encompass all major types of library materials. From the approach of the IFLA Study Group, it can be inferred that it was driven partly by concern about the size of the bibliographic record (minimum requirements for national bibliographies) as well as by the economies which can be made in its creation. To recommend a basic level of functionality that relates specifically to entities was an additional charge given to the Study Group.

This chapter attempts to complement IFLA's work, in that it will consider the influence of the new technologies (e.g., the indexing of cataloguing data and the search/retrieval/display capabilities of online systems as well as networking) on the functionality of individual data elements in the bibliographic record. For example, in terms of the functional requirements of data elements used for searching and retrieval of remote databases, the attributes defined in the Z39.50 standard will be taken into consideration. By discussing the need for individual data elements in different communities, this chapter will attempt to justify the need for a fuller level of description in bibliographic records functioning in an online environment. The chapter will also attempt to categorise data elements according to their various functions and provide the necessary background for the discussions in Chapter 6 on how the functions of the bibliographic record should be taken into consideration in any re-examination of cataloguing principles.

PART ONE

IDENTIFICATION OF THE CONTENT AND FUNCTIONAL ANALYSIS OF THE BIBLIOGRAPHIC RECORD

This part deals with the content and functional analysis of bibliographic records with respect to the needs of different communities in general and the library community in particular. Functional analysis is the analysis required for identification of the types of data elements, their various uses and the requirements associated with a particular function.

Before examining the functions of the bibliographic record in different contexts and determining which data elements perform which functions, it is appropriate to commence with a brief study of trends in the co-operative creation and use of bibliographic records by different communities such as publishers, booksellers and librarians.

1.0 The bibliographic record in the bibliographic chain

As electronic access to bibliographic files is becoming increasingly commonplace, there is a parallel awareness of the advantages of co-operative creation and use of bibliographic information by different communities. Almost everyone involved in the book world in the developed countries produces, exchanges, edits and accesses bibliographic data. Thus the bibliographic record in the online environment reflects a wide spectrum of objectives, interests and uses. The identification and functional analysis of data elements used at each stage of the bibliographic chain can help provide a more comprehensive approach toward the rationale and requirements of bibliographic records.

A major concern for bibliographic control in different communities relates to the duplication of effort and resulting expense in the creation of bibliographic information. In fact, the chain is the sequence of bibliographic activities by different interested parties which have some elements in common. These elements should not be duplicated where it is feasible to avoid duplication. Ideally, a bibliographic record for each publication should be created only once and thereafter used and built upon by all persons involved in the bibliographic chain. Libraries, being more dependent on bibliographic records, would benefit more than others. A record created by publishers at the first stage would be useful to both librarians and library patrons. Descriptive and, particularly, subject description elements such as abstracts, pages of contents, back-of-book indexes and identification of the target audience, are very useful to libraries. In reality, the special needs of different users as well as economic, technological and administrative factors would influence the achievement of this concept. In terms of this ideal, Hagler (1991: 107) asks:

Who should create the model record? Who should be responsible for transmitting it to its potential user(s)? What is the best method of communicating it? Who should pay the costs involved? How can quality control be administered if responsibility for record creation is dispersed among different agencies? What priority should be assigned in pursuing the ideal in areas where it conflicts with local service and administrative considerations? None of these questions admits of any single or any permanently valid answer.

A number of national and international meetings have been held over recent years focusing largely on the creation and use of bibliographic information in a variety of environments and also on the need for a re-examination of the concept of the bibliographic record. The Newbury Seminar on Bibliographic Records in the Book World (November 1987), which was one of the first attempts in this area, aimed to identify real requirements for bibliographic records by various users such as librarians, booksellers and readers (Greenwood, 1988). The Seminar on Bibliographic Records, Stockholm, 1990, sponsored by the IFLA Universal Bibliographic Control and International MARC Programme (UBCIM) and the IFLA Division of Bibliographic Control is one of the most recent attempts in this field (Bourne, 1992). The Seminar was basically concerned with the possibility of introducing a common approach towards the creation of bibliographic records at a national level through the coordination of the different operations in the bibliographic chain.

The Book Industry Communication (BIC), set up in the UK and sponsored by the Publishers Association, the Booksellers Association, the [British] Library Association and the British Library, has as its major objective the development and promotion of standards for electronic communication of information within the book industry (Book Industry Communication, 1992: 8). To achieve this objective, the BIC has developed guidelines and standards for the construction of publishers' bibliographic databases (Ibid: 3). One of these guidelines identifies for publishers those data elements (see Matrix 5.1) which should be included in the bibliographic database. As the Matrix shows, there are signs of a demand for extra descriptive information in bibliographic records. Most data elements used in the book world have relevant fields in

Matrix 5.1 Data elements used in the book world and libraries*


Data elements                       Publishers'    Books in  Libraries:    Libraries:
                                    bibliographic  Print     Acquisitions  Cataloguing
                                    database

Title Information
  Title in full                     x              x         x             x
  Short title                       x                        x             x
  Subtitle                          x              x         x             x
  Series title                      x              x         x             x
  Volume/part number                x              x         x             x
  Volume title & subtitle           x              x         x             x
  Year of annual/yearbook           x                                      x
  Edition                           x              x         x             x

Author Information
  Main author/editor                x              x         x             x
  Author data                       x                        x
  Corporate or conference body      x              x         x             x
  Other authors/contributors        x              x         x             x
  Contributor function              x              x         x             x

Publication Information
  ISBN or ISSN                      x              x         x             x
  ISBN comment                      x
  Current imprint                   x
  Publisher/distributor             x              x         x             x
  Country of publication            x              x         x             x
  Date of publication               x              x         x             x
  Publishing history                x              x         x             x
  Availability status               x              x         x             x
  Reprint date                      x              x         x             x
  Price                             x              x         x             x
  Market availability               x              x
  Distributor and address           x                        x

Physical Information
  Number of pages                   x              x         x             x
  Number and type of illus.         x                        x             x
  Binding                           x                                      x
  Number of physical parts                         x         x             x
  Dimensions                        x                        x             x
  Weight                            x
  Language of text                  x              x         x             x
  Language from which translated    x                        x             x

Content Information
  Fiction/non-fiction               x              x         x             x
  Subject description               x              x         x             x
  Readership level                  x              x                       x
  Summary                           x                                      x
  Contents list                     x                                      x



*The first two columns are adopted from Book Industry Communication (1992). Column three is based on a recent (1995) issue of the 'Books In Print' in printed format. The last two columns are based on the information generally included in acquisitions and cataloguing records in many libraries now.

the MARC format. However, many libraries do not enter some of the data, either because they are ephemeral and variable over time or because they are considered unnecessary within certain types of libraries. For example, many libraries do not enter prices in their records. Binding may be noted in the cataloguing of rare materials or where the binding material is very unusual. Also, many data elements are entered by acquisitions librarians in their records, but not consistently and not according to the relevant standards used by cataloguers. The Matrix shows that many similar data elements are used in the book world and libraries and can be shared for the purposes of exchange and of avoiding duplication.

1.1 What are the functions of the bibliographic record?

As noted earlier, the bibliographic record is the principal means for bibliographic control. In a cataloguing context, the major functions of the bibliographic record are:

1) to describe the physical item at a self-sufficient level,

2) to identify the intellectual/artistic nature, content and scope of the work and to uniquely distinguish bibliographic entities,

3) to provide links to related works/items,

4) to provide common information for citation purposes, and

5) to locate the item for physical access.

The cataloguing community is attempting to achieve a thorough examination of all the functions of the bibliographic record to meet users' needs. The functions of the bibliographic record as a whole have been elaborated on by the IFLA Study Group on Functional Requirements of Bibliographic Records (1995: 7) as follows:

1. to identify bibliographic items uniquely,

2. to relate, that is, to indicate related bibliographic entities (e.g., through various linking devices as well as collocation in displays of records),

3. to assist in the choice of a particular bibliographic entity (e.g., through annotation, summary, etc.),

4. to assist in the delivery or retrieval of bibliographic entities themselves (e.g., through call number, physical location, text retrieval, etc.), and

5. to provide information about itself for database maintenance (i.e., to housekeep).

Computer technology has influenced the functions of the bibliographic record in many different ways:

1) it has made it possible both to integrate all separate functions and procedures and to make a single bibliographic record that is applicable to various operations in the library database management system. Integrated systems support more operations based on the same multi-functional, electronic record. It is, therefore, essential that the bibliographic record should satisfy the needs of cataloguers, reference librarians, interlibrary lending librarians, acquisitions librarians and, most importantly, end-users.

2) it has made possible the formatting and reformatting of records for different purposes in different environments.

3) it has facilitated record enhancement for better identification of and access to bibliographic entities through the addition of tables of contents, summaries, back of the book indexes and full texts.

4) it has facilitated the communication of bibliographic records between systems and within systems.

5) it has made bibliographic control possible in a more precise way not feasible in the manual environment. This is done through providing multiple access points, both customary and new.

The various functions of the bibliographic record are carried out through a set of data elements (i.e., attributes of entities) structured in a defined order. These data elements are derived from the entity at different levels of the bibliographic hierarchy. A thorough identification and functional analysis of data elements and data types helps to identify how the bibliographic record fulfils its functions in a database management system.

1.2 Data elements in the bibliographic record: identification and functional analysis

Few studies have been done so far on the identification and functional analysis of data elements in bibliographic records. According to Bregzis (1970), in the early 1960s there did not even exist a systematic listing of all data elements that could function as structural components of the bibliographic record. Three major studies have addressed this issue. The first was carried out by Curran and Avram (1967) for the Sectional Committee on Library Work and Documentation (Z-39) of the United States of America Standards Institute. They emphasised the importance of and need for separate identification of all possible data elements for the structural formulation of the bibliographic record. The study served as the basis for the formulation of the MARC II format by both the Library of Congress and the British National Bibliography. The second, which has been one of the most notable studies concerning the identification of major data elements in bibliographic records, was a research project carried out by Seal, Bryant and Hall (1982) for the University of Bath Centre for Bibliographic Management. By creating a parallel catalogue of short entries, the study aimed to identify which data elements were most used by catalogue users, mainly university students, in order to find and locate items. The study, however, has been seriously criticised for a lack of external validity, in that it was carried out with a limited sample representing only library patrons' needs and excluded librarians working in different operations (Svenonius, 1990: 40; 1992: 4). The third study, as indicated earlier, is being carried out by the IFLA Study Group on Functional Requirements of Bibliographic Records.

To achieve the major aim of this chapter, i.e., to identify the content and functions of the bibliographic record, it is necessary to analyse all the areas of bibliographic description as well as some other data fields that have been added to the record in its machine-readable form.

In the following section, a minimum set of data elements will be introduced according to the sequence in which they appear in a machine-readable format. The reason for this approach, as a starting point for the identification and functional analysis of data elements, is that database systems impose a record structure that is different from that of manual systems, i.e., a structure that is broken into fields, subfields and further data elements. As discussed in Chapter 3, this structure and the separate identification and tagging of data elements allow for their inclusion or exclusion as well as their organisation, formatting and reformatting. This is also a requirement for integrated systems, in which a single record forms the basis for different operations.
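The effect of separate identification and tagging can be illustrated with a minimal Python sketch. The record layout (a list of tag and subfield pairs), the sample values and the function names `get`, `brief_view` and `acquisitions_view` are assumptions made for illustration only; they do not reproduce the internal format of any particular system.

```python
# Hypothetical record: each field is (tag, {subfield code: value}).
# Tags follow USMARC conventions; the values are invented for illustration.
record = [
    ("100", {"a": "Smith, John"}),
    ("245", {"a": "An example title", "c": "John Smith"}),
    ("260", {"a": "London", "b": "Example Press", "c": "1990"}),
    ("300", {"a": "200 p."}),
]


def get(record, tag, code):
    """Return the first value of a tagged subfield, or '' if it is absent."""
    for field_tag, subfields in record:
        if field_tag == tag and code in subfields:
            return subfields[code]
    return ""


def brief_view(record):
    """Author / title / date, as a catalogue brief display might select."""
    parts = [get(record, "100", "a"), get(record, "245", "a"), get(record, "260", "c")]
    return " / ".join(parts)


def acquisitions_view(record):
    """A different subset of the same record for an acquisitions operation."""
    return {
        "title": get(record, "245", "a"),
        "publisher": get(record, "260", "b"),
        "extent": get(record, "300", "a"),
    }


print(brief_view(record))
print(acquisitions_view(record))
```

Because every element is addressed by tag and subfield code, the same stored record yields both views without being re-keyed.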

For the purposes of this chapter, the USMARC bibliographic format has been selected, not as a framework to predetermine data elements but to illustrate the concept of separate identification, categorisation and coding of data elements. The reason for the choice of USMARC (and not, for example, UNIMARC) is that USMARC follows, to a large extent, a specific cataloguing code, i.e., the Anglo-American Cataloguing Rules. This is in line with the general aim of this study which deals with the relevance of cataloguing principles and rules to the online environment. Moreover, almost all discussions in the AUTOCAT list concerning problems in indexing of fields and subfields refer to USMARC. This latter issue will be dealt with in Part 2 of this chapter.

Reference will also be made to relevant values in the Z39.50 standard (for a list of attributes in this standard, see Appendix 2) to accomplish the study of the content and functions of bibliographic records in relation to searching and retrieval in an interconnected environment. The Z39.50 standard has identified and coded a list of data elements necessary for searching and retrieval of bibliographic information in remote databases and files.

In the following section, a field-by-field approach to the bibliographic record will be taken to identify data elements deemed to be most important for different functions within the library context and to analyse them from both conceptual and database perspectives. Thus the list of data elements is not intended to be exhaustive. It corresponds more closely to the AACR and the required fields in the MARC format. For the purposes of this section, with some modifications to the names of fields in the USMARC bibliographic format, the major fields which will be examined here are:

Fixed-length data elements

Standard numbers and codes

Language of the item

Main entry headings (personal names)

Title and statement of responsibility

Edition statement

Publication, Distribution, etc.

Physical Description Area

Series Statements

Notes

1.2.1 Fixed-length data elements (Control field)

(008; values 52-62 in Z39.50)

The USMARC field 008 contains coded information about the record as a whole and about the nature of the item being catalogued. The coded data elements in field 008 are potentially useful for retrieval and data management purposes (USMARC Bibliographic Format, 1989: 008, p. 3). For all formats, some of the information is: /06 Type of date/Publication status; /07-10 Date 1/Beginning date of publication; /11-14 Date 2/Ending date of publication; /15-17 Place of publication, production, or execution; /35-37 Language. Values 52-61 in Z39.50 are considered essential for similar functions when searching remote databases.
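Since the 008 is a fixed-length field, each coded element can be read directly from its character positions. The following Python sketch illustrates this for the positions common to all formats; the sample value and the names `POSITIONS_008` and `parse_008` are invented for illustration, and the material-specific positions are simply left blank.

```python
# Character positions common to all formats. Slices are (start, end) with the
# end exclusive, so positions 07-10 correspond to the slice [7:11].
POSITIONS_008 = {
    "type_of_date": (6, 7),
    "date_1": (7, 11),
    "date_2": (11, 15),
    "place": (15, 18),
    "language": (35, 38),
}


def parse_008(field_008):
    """Split a 40-character 008 field into named coded elements."""
    if len(field_008) != 40:
        raise ValueError("an 008 field is expected to be 40 characters long")
    return {name: field_008[start:end] for name, (start, end) in POSITIONS_008.items()}


# Invented sample value, assembled piece by piece so the positions are visible.
sample_008 = (
    "960112"      # 00-05 date entered on file
    + "s"         # 06    type of date (single date)
    + "1927"      # 07-10 date 1
    + "    "      # 11-14 date 2 (blank)
    + "nyu"       # 15-17 place of publication
    + " " * 17    # 18-34 material-specific positions, left blank here
    + "eng"       # 35-37 language
    + " d"        # 38-39 modified record, cataloguing source
)

print(parse_008(sample_008))
```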

This field also has different meanings depending on the type of material. For books, it contains different values such as: Illustrations; Target audience; Form of item; Nature of content; Government publication; Conference Publication; Festschrift; Index; Fiction; Biography. Such values, which express the nature of works, are important for the user. However, most of them are now redundant since the same information is in a variable field. While the information is not in everyday use, some automation systems (e.g., MELVYL) use the information in the 008, most notably the date, language and country codes, for limiting searches and for providing, for example, a listing of all the items published in a certain country in a particular type of publication in a particular year. The same information is also much used for database management (housekeeping) purposes, such as record de-duping (Karen Coyle <kec@stubbs.ucop.edu>, in a posting to AUTOCAT, 12 January 1996).

Publication type (e.g., government publication, technical report, patent, conference publication, festschrift, fiction, biography, dissertation) is an attribute reflecting the nature of the entity. Such information is becoming more important to different users, i.e., librarians, publishers, library suppliers and library patrons. It is a data element that helps the searcher to choose one type of publication over others. Whilst indicating this element in the manual environment was partially done by the addition of the type of publication as a subdivision to subject headings, in the machine-readable record it is also specified in a specific field (008), which is broken up into separate fixed fields, enabling the system to restrict the search to the type of publication specified by the searcher. The type of publication code can be used to select records to print a bibliography of, for example, dissertations.
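Restricting a search by coded values of this kind amounts to a simple filter over the retrieved set. The sketch below is a hedged illustration in Python: the records are hypothetical and already decoded, and the key names (`publication_type`, `language`, `year`) and the function `limit` are assumptions rather than features of any real catalogue.

```python
# Hypothetical, already-decoded records; in a real system these values would
# be derived from coded positions in field 008.
records = [
    {"title": "Thesis A", "publication_type": "dissertation", "language": "eng", "year": 1991},
    {"title": "Report B", "publication_type": "technical report", "language": "eng", "year": 1991},
    {"title": "Thesis C", "publication_type": "dissertation", "language": "fre", "year": 1989},
]


def limit(records, **criteria):
    """Keep only the records whose coded values match every criterion."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


# Select records for a bibliography of dissertations.
dissertations = limit(records, publication_type="dissertation")

# Combine limits, e.g. English-language items published in 1991.
english_1991 = limit(records, language="eng", year=1991)

print([r["title"] for r in dissertations])
print([r["title"] for r in english_1991])
```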

1.2.2 Standard numbers and codes

(01X-09X and sometimes 500; values 7-20 in Z39.50)

Standard numbers are among the attributes that are usually added to most bibliographic entities at the publishing stage (i.e., the manifestation level) and at the cataloguing stage (the item level). These numbers are nationally and/or internationally agreed upon and are intended to identify books and serials uniquely. The International Standard Book Number (ISBN), field 020 in USMARC and value 7 in Z39.50 and the International Standard Serial Number (ISSN), field 022 in USMARC and value 8 in Z39.50, are two widely used data elements in the bibliographic chain and are assigned to entities by related international bodies. Publishers, booksellers and library suppliers, as well as librarians, frequently use these numbers in different operations to identify and refer to 'specific' publications. In addition to ISBN and ISSN, different control and classification numbers are considered in both USMARC and Z39.50 for identification, finding and locating functions but such numbers are mostly related to a particular database or library.

Some systems allow searching through standard numbers. Since such numbers are controlled codes, retrieval based on them is straightforward (Rowley, 1989: 11). However, it is difficult for library patrons to identify items through such numbers since they are rarely familiar with them. These data elements are also not the type of attributes associated with the entity at the work level. Although they are assigned to items at the item level, they usually do not appear on the front cover or title page where the reader normally looks for key data elements. According to Lancaster and Smith (1983: 43), there are a number of disadvantages with the standard number as a searching element: 1) not all documents have ISBNs (or ISSNs), 2) very few present catalogues allow access via the ISBN, and 3) it is rather unlikely that the end-user of a library catalogue will know the ISBN for a sought item. Since few standard numbers turn out to be sufficiently 'well-behaved', they do not always uniquely identify a single bibliographic entity (Attig, 1989: 143).
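One reason why retrieval by standard number is straightforward is that both the ten-digit ISBN and the ISSN carry a modulus-11 check digit, so that a system can reject most mis-keyed numbers before searching. The following Python sketch shows the calculation; the function name `mod11_valid` is an assumption made for illustration.

```python
def mod11_valid(number):
    """Check a ten-digit ISBN or an ISSN by its modulus-11 check digit.

    Hyphens and spaces are ignored. Weights run from the length of the
    number down to 1 (10..1 for an ISBN, 8..1 for an ISSN), and 'X' in the
    final position stands for the value 10.
    """
    chars = [c for c in number.upper() if c not in "- "]
    if len(chars) not in (8, 10):
        return False
    total = 0
    for weight, char in zip(range(len(chars), 0, -1), chars):
        if char == "X" and weight == 1:
            value = 10
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += weight * value
    return total % 11 == 0


print(mod11_valid("0-306-40615-2"))  # a ten-digit ISBN
print(mod11_valid("0378-5955"))      # an ISSN
print(mod11_valid("0-306-40615-3"))  # the same ISBN with a wrong check digit
```

A valid check digit shows only that the number is well formed; it does not show that the number has been assigned to the item in hand, which is the problem of 'well-behaved' numbers noted above.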

In terms of the use of the MARC area for standard numbers as linking devices, Hagler (1991: 232) points out that, although not intended so in theory, the area is sometimes used in practice for recording standard numbers for related items, such as a paperback version or an edition distributed by a different publisher. In the case of serials with variant titles, the ISSN can function as a collocating device. For example, in some cases, standard numbers can be used to link successive titles of serials and create a cluster record to include the separate records for the different titles (Alan, 1993).
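The clustering of successive serial titles through linking numbers can be sketched as follows. This is an illustration under stated assumptions, not a description of the method reported by Alan (1993): the records, the key `preceding_issn` and the placeholder numbers (`issn-1`, etc.) are all hypothetical.

```python
# Hypothetical serial records. "preceding_issn" stands for a linking entry
# carrying the number of the earlier title; the numbers are placeholders.
serials = [
    {"title": "Journal of Examples", "issn": "issn-1", "preceding_issn": None},
    {"title": "Examples Quarterly", "issn": "issn-2", "preceding_issn": "issn-1"},
    {"title": "Examples Review", "issn": "issn-3", "preceding_issn": "issn-2"},
    {"title": "Unrelated Bulletin", "issn": "issn-9", "preceding_issn": None},
]


def cluster_by_issn(serials):
    """Group records that are linked, directly or indirectly, by number."""
    preceding = {s["issn"]: s["preceding_issn"] for s in serials}

    def earliest(issn):
        seen = set()
        # Follow links back to the first title; stop on unknown or cyclic links.
        while preceding.get(issn) in preceding and issn not in seen:
            seen.add(issn)
            issn = preceding[issn]
        return issn

    clusters = {}
    for s in serials:
        clusters.setdefault(earliest(s["issn"]), []).append(s["title"])
    return clusters


print(cluster_by_issn(serials))
```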

Another problem with standard numbers is that they cannot normally be combined, for searching, with other key data elements such as the author heading or title or subject headings. There is no logical link between standard numbers and key data elements such as the author, title, or uniform title.

1.2.3 Language of the item

(041, 500, 546 and also 008/35-37 in USMARC; value 54 in Z39.50)

Language is an important attribute of a work and an essential data element in bibliographic control that serves for the identification of items as well as for restricting search results. It can show the differences between versions of the same entity or different languages in a work. Most publishers and library suppliers include language in their databases (see Matrix 5.1). Some online catalogues and most bibliographic databases, particularly journal article databases, allow the combination of 'language' with other data elements to narrow search results and to find more specific items. For the same reason, language is one of the attributes defined in the Z39.50 standard as a potential element when searching remote bibliographic databases.

Language Code (field 041) is used for works containing more than one language or in cases where the language of the text is different from the language of the table of contents, summary or abstract. There are also other cases, such as texts in two or more languages, in which more than one language has to be coded in a record to identify an item better. In short, the coding of language in 008/35-37 (Control field-language), 041 (Language Code), 500 (General Note) and 546 (Language Note) has added to the functionality of bibliographic records.
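How the coded language elements might be brought together for a language limit is sketched below in Python. The function names and the representation of field 041 as a mapping from subfield code to a list of three-letter codes ($a for the text, $b for the summary or abstract) are simplifying assumptions for illustration.

```python
def record_languages(field_008, field_041=None):
    """Collect the language codes attached to a record.

    field_008 is the 40-character control field; field_041 is an optional
    mapping of subfield code to a list of three-letter language codes.
    """
    languages = {"text": [], "summary": []}
    primary = field_008[35:38].strip()
    if primary:
        languages["text"].append(primary)
    for code, role in (("a", "text"), ("b", "summary")):
        for lang in (field_041 or {}).get(code, []):
            if lang not in languages[role]:
                languages[role].append(lang)
    return languages


def matches_language(languages, wanted):
    """True if at least part of the text is in the wanted language."""
    return wanted in languages["text"]


# Invented example: an English and French text with a German summary.
sample_008 = " " * 35 + "eng" + "  "
sample_041 = {"a": ["eng", "fre"], "b": ["ger"]}

langs = record_languages(sample_008, sample_041)
print(langs)
print(matches_language(langs, "fre"))
```

A limit based on 008/35-37 alone would miss the French text in this example, which is why the additional coding in field 041 adds to the functionality of the record.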

While language is being treated as more important in electronic databases, current cataloguing codes take a traditional approach toward it; except in the case of uniform titles and notes, language as a data element does not usually appear on the catalogue record in traditional cataloguing.

1.2.4 Main entry heading (personal name)

(Fields 100, 110, 111 and 130 in USMARC; values 1, 2, 3, 1003, 1004, 1005 and 1006 in Z39.50)

Although the block 1XX in USMARC is defined as main entries (personal name, corporate name and meeting name), this section focuses on personal names as a primary and common data element associated with entities and used in different environments.

As an identifying element, the name of the author is one of the most important and widely used data elements in trade lists, catalogues, bibliographies and citations, in that where there is an author's name associated with the work or the item, it forms an integral part of the bibliographic record. As a finding device, the author heading is considered to be an essential element for providing access to the desired item in a catalogue, whether it is manual or online. If the searcher cannot recall the exact title, the name of the author would be a useful search key. From the results of their studies, Meador and Wittig (1991) conclude that 65% of searches in the field of chemistry and 86.66% of searches in the field of economics are done through the author's name. Hufford (1991: 58) reports that 22.9% of searches by reference staff were author searches. Reporting the findings of her study of OCLC bibliographic records, Arlene Taylor (1992: 227) notes that only 5.6% of a random sample of records do not have any personal or corporate names. Name access is thus an integral part of the bibliographic record.

As a collocating device, the author heading provides access to all works by a given author, so far as it is subject to authority control. In this sense, it is a data element which serves the needs of different users such as the book trade, librarians and library patrons.

As an organising device, the name of the author is a key element in alphabetical catalogues and lists. In this context, it has been one of the most important elements for constructing catalogues, bibliographies, citations and trade lists in the Western tradition. In a computerised system, however, the role of the author's name as an organising element is limited to browsable name indexes and brief displays in which the search result can be sorted according to author heading, as in the following title search:

    AUTHOR                          TITLE                    DATE

 1  Boissonnade Prosper             History of civilization  1927
 2  Bowra C M Cecil Maurice         History of civilization  1927
 3  Burns Arthur Robert             History of civilization  1932
 4  Childe V Gordon Vere Gordon     History of civilization  1934
 5  Declareuil J Joseph             History of civilization  1925
 6  Ghurye G S Govind Sadashiv      History of civilization  1927
 7  Grenier Albert                  History of civilization  1935
 8  Guignebert Charles Alfred Hono  History of civilization  1926
 9  Hubert Henri                    History of civilization  1931
10  Perrier Edmond                  History of civilization  1957
11  Petit Dutaillis Charles         History of civilization  1927
12  Prestage Edgar                  History of civilization  1925
13  Rivers W H R William Halse Riv  History of civilization  1928
14  Summers Montague                History of civilization  1924
15  Summers Montague                History of civilization  1936
16  Thomas Edward J Edward Joseph   History of civilization  1926
17  Vogt Joseph                     History of civilization  1993

Figure 5.1 Brief display of Author/Title/Date in the University of New South Wales Library's OPAC

In almost all online catalogues that maintain a brief display, the author heading is an essential element in the display of bibliographic information. Since a title may vary from edition to edition, or the titles of two or more different items may coincidentally be identical, the exclusion of the principal author heading from the brief display (author, title, and date) would obscure the relationship of the retrieved items between themselves and would fail in the finding and identification of the sought item(s). A search under the title 'History of civilization' in large databases would retrieve many records (for example, it resulted in 17 items in the University of New South Wales OPAC (Figure 5.1)). As can be seen, the author heading is needed to differentiate between publications in alphabetical lists and its exclusion from the display would obscure one important element in the further identification of entities.
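The organising role of the author heading in a brief display can be illustrated with a minimal sketch. The record structure (BriefEntry) and the function name are assumptions made for illustration only and do not describe the software of any particular OPAC; the sample entries are taken from Figure 5.1.

```python
from typing import NamedTuple

class BriefEntry(NamedTuple):
    author: str  # author heading (e.g. from field 100); hypothetical structure
    title: str   # title proper (e.g. from field 245 $a)
    date: int    # year of publication

def brief_display(entries):
    """Return display lines sorted by author heading, then title, then date."""
    ordered = sorted(entries, key=lambda e: (e.author.lower(), e.title.lower(), e.date))
    return [f"{i:>2} {e.author:<30} {e.title:<25} {e.date}"
            for i, e in enumerate(ordered, 1)]

# Four of the identical titles retrieved in Figure 5.1, in arbitrary order
hits = [
    BriefEntry("Summers Montague", "History of civilization", 1936),
    BriefEntry("Hubert Henri", "History of civilization", 1931),
    BriefEntry("Summers Montague", "History of civilization", 1924),
    BriefEntry("Burns Arthur Robert", "History of civilization", 1932),
]
lines = brief_display(hits)
print("\n".join(lines))
```

Without the author column the four lines would differ only in date; where the author heading is also identical (the two Summers entries), the date becomes the differentiating element.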

Names of secondary authors and of other persons associated with the work may be important to different users. In library cataloguing, only the name of the principal author (i.e., the person considered to have primary intellectual/artistic responsibility) is entered in field 100 and the names of other persons are entered in the statement of responsibility (subfield $c) and repeated in access point form in block 7XX (added entries). In other communities, such as publishers and booksellers, the names of all authors and principal contributors are usually treated equally and are entered in one place. This is an area in which the scope of the function of this data field differs between library cataloguing and publishers' practices. Another difference can be seen in the attributes of the author, such as dates of birth and death and his/her affiliation. Library cataloguing considers dates associated with an author to be important, whereas publishers are less concerned with such information.

In short, the names of the persons responsible for the intellectual/artistic content of items are needed for the following purposes: 1) to find works by a given person, 2) to collocate entries for the same person, 3) to organise entries and to differentiate between publications in alphabetical lists such as card catalogues, printed bibliographies, browse indexes for persons and brief displays in online catalogues, 4) to differentiate between items with identical titles, and 5) to identify items for ordering, circulation and interlibrary loan.

1.2.5 Title information and statement of responsibility

(130, 21X-24X, 440, 490, 730, 740, 830, 840, subfield t in the 400, 410, 411, 600, 610, 611, 700, 710, 711, 800, 810, 811; values 4, 5, 6, 33-44 in Z39.50)

As can be seen in the USMARC format and also from the values assigned to the title, the title is one of the most important identifying and finding attributes of a work and is usually determined at the first stage when a work is created and made ready for publication. While author, subject or any other data element alone does not represent the bibliographic entity (the work or the item), the title is usually a useful data element for expressing an entity. In titling a publication, however, different persons, such as the author, the editor and the publisher are involved or may influence the process. That is one reason why a title may change at different entity levels or additional titles (e.g., cover title, spine title and abbreviated title) may be assigned to the publication. In general, the title of an item may not be the same as the title of the work of which the item is a representation.

The title is one of the most important and widely used data elements in the bibliographic chain and in bibliographic control. Publishers, library suppliers, acquisition librarians, circulation librarians, reference librarians, document delivery librarians and cataloguers use the title for different operations. For example, from his study of three ARL libraries, Hufford (1991: 58) found that 53.1% of searches by reference staff members were title searches. The title also forms one of the most important indexes in national bibliographies and trade lists. In an alphabetical catalogue, the title is an important and useful access point, as an identifier of publications and, usually, as a summary of the content of documents.

While the title has considerable potential in retrieval, it is not, in some cases, a reliable identifier of publications such as in the case of documents with identical titles. For example, as illustrated in Figure 5.1, there are many books with titles such as History of Civilization. In some cases, even a title and an author's name together will not help the searcher to identify and locate the exact document unless a third element, such as an edition or a date of publication or a name of a publisher, is displayed. It would be better if the user used a combination of the title with other bibliographic data elements to specifically search for what is needed.

Another problem with the title is that it may vary in later edition(s), version(s), translation(s), etc. The searcher cannot rely on the words of the title to retrieve all editions or versions of a given work. Variations in the title have been a cause for concern throughout the history of descriptive cataloguing. While a slight change of title may be considered insignificant (see, for example, rule 21.2A1 in AACR2), it may have implications for online catalogues, possibly resulting in a 'no hit' situation. The provision of notes such as 'Title varies slightly' would not help the searcher find the item under variant titles.

The title is a useful element for keyword searching. With keyword access to titles it is possible to retrieve a title when the exact form is not known to the user. Another advantage of title keyword searching is that, in some disciplines such as science and technology, titles usually describe the subject content of publications and so can be used for subject searching. This approach, which has already been applied in manual indexing techniques such as KWIC (keyword in context) and KWOC (keyword out of context), has also been promoted in online catalogues through keyword searching. In stating that titles can form a useful basis for subject searching, Rowley (1989: 11) asks whether AACR2R should have made recommendations concerning the enrichment or expansion of titles.
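The KWIC technique mentioned above can be sketched in a few lines. This is a simplified illustration, not a reconstruction of any historical indexing system: the stopword list, the context width and the second sample title are assumptions chosen for the example.

```python
# Illustrative stopword list; a production index would use a fuller one
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}

def kwic(titles, width=30):
    """Build a keyword-in-context index: one row per significant title word.

    Each row is (keyword, line), where the line shows the words preceding
    the keyword right-aligned and the keyword plus what follows it.
    """
    rows = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,:;").lower()
            if not key or key in STOPWORDS:
                continue
            left = " ".join(words[:i])
            right = " ".join(words[i:])
            rows.append((key, f"{left[-width:]:>{width}}  {right[:width]}"))
    rows.sort(key=lambda r: r[0])  # file the rows alphabetically by keyword
    return rows

index = kwic(["History of civilization", "The chemistry of polymers"])
for key, line in index:
    print(line)
```

Every significant word of the title becomes an entry point, which is why a searcher who remembers only 'civilization' or 'polymers' can still reach the record.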

The title is usually a key element in the arrangement of entries. In online catalogues maintaining a browsable title index, titles should be arranged in their right place with a certain degree of predictability for retrieval. In this context, 'various title information', such as uniform title, other title information, alternative title, parallel title, cover title, spine title, former title, abbreviated title, collective title and expanded title (fields 130, 21X-24X, 440, 490, 730, 740, 830, 840, subfield t in 400, 410, 411, 505, 600, 610, 611, 700, 710, 711, 800, 810 and 811; values 34-44 in Z39.50) can be included in the browsable title index to help the searcher find his/her item of interest.
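A browsable title index drawing on several title-bearing fields might be built along the following lines. The sketch assumes a deliberately simplified record representation (a dictionary mapping a tag to a list of title strings) and covers only some of the tags listed above. The filing rule, which merely drops a leading English article, is also an assumption; MARC records normally carry non-filing indicators for this purpose.

```python
# A subset of the title-bearing tags discussed in the text
TITLE_TAGS = ("130", "240", "245", "246", "440", "490", "730", "740", "830")

def filing_form(title):
    """Crude filing key: lowercase, strip punctuation, drop a leading English article."""
    cleaned = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    words = cleaned.split()
    if words and words[0] in {"a", "an", "the"}:
        words = words[1:]
    return " ".join(words)

def build_title_index(records):
    """Return sorted (filing key, title as recorded, tag, record id) entries."""
    entries = []
    for rec_id, fields in records.items():
        for tag in TITLE_TAGS:
            for title in fields.get(tag, []):
                entries.append((filing_form(title), title, tag, rec_id))
    return sorted(entries)

# Hypothetical records: r1 has a variant title in 246, r2 a uniform title in 240
records = {
    "r1": {"245": ["The origin of species"], "246": ["On the origin of species"]},
    "r2": {"245": ["Hamlet"], "240": ["Hamlet"]},
}
title_index = build_title_index(records)
for key, title, tag, rec_id in title_index:
    print(f"{key:<28} {tag} {rec_id}")
```

Because the variant title in 246 is filed alongside the title proper, record r1 can be reached under either form in the browse list.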

As a linking device, the title, when indexed as an added entry and as long as it does not change, serves to link related entities. In those cases where a title does not change in the different editions of a work, it can automatically link related works and items. For example, in books such as reference sources and textbooks edited and published at short intervals, a title collocates all the editions prepared by different editors or published by different publishers. In those cases where the title changes in different editions, an added access point for the original title will serve as a means to link different editions. There are other linked title fields (e.g., 78X) and notes to linked titles (e.g., 501) that provide linkage between related titles.

A useful construct in the context of title information is the name of the work or 'uniform title' (fields 130, 240 and 730; value 6 in Z39.50) which is used as a device for bringing together different editions and manifestations of a work and also as a filing element in alphabetical catalogues. As a cataloguing principle that needs to be re-examined in relation to the online environment, the uniform title will be discussed in detail in Chapter 6. As a data element whose inclusion in the bibliographic record enhances retrieval functions, the uniform title has been one of the most important constructs devised in library catalogues. It should be noted that the term 'uniform title' actually encompasses many kinds of titles, including form headings, filing titles, unique titles for serials, collective titles and standard titles (Tillett, 1987:??). While the emphasis on uniform titles was introduced by Lubetzky in the 1960s, some forms (e.g., Bible, Laws) have been in use for a long time.

As a search key, the uniform title alone does not make sense in online catalogues any more than it does in large manual catalogues. The difference is, of course, that the next step towards differentiating between entries online is not as immediate as in card or book catalogues where the rest of each entry is immediately viewable. A search in large catalogues under 'Hamlet' will retrieve too many records: different editions and manifestations, works about Hamlet, as well as works with the title 'Hamlet' written by other writers (see Appendixes 3.1 to 3.4 for printouts of a similar search in a medium-size catalogue). The problem will be increased when the user searches in a large shared catalogue, such as a national union catalogue, in which there will be a greater number of editions and translations or manifestations of a work held by different libraries.
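How the uniform title, taken together with the author heading, could separate the results of such a search can be sketched as follows. The flat record structure is an assumption, and the records themselves are invented for illustration ('Example, Author' is not a real heading).

```python
from collections import defaultdict

def collocate(records):
    """Group record ids under (author heading, uniform title or, failing that, title proper)."""
    works = defaultdict(list)
    for rec in records:
        work_title = rec.get("240") or rec.get("130") or rec["245"]
        works[(rec.get("100", ""), work_title.lower())].append(rec["id"])
    return dict(works)

records = [
    {"id": 1, "100": "Shakespeare, William", "240": "Hamlet",
     "245": "The tragedy of Hamlet, Prince of Denmark"},
    {"id": 2, "100": "Shakespeare, William", "240": "Hamlet", "245": "Hamlet"},
    {"id": 3, "100": "Shakespeare, William", "240": "Hamlet",
     "245": "Hamlet, prince de Danemark"},
    # An unrelated work that happens to share the title (invented record)
    {"id": 4, "100": "Example, Author", "245": "Hamlet"},
]
groups = collocate(records)
for (author, title), ids in groups.items():
    print(author, "/", title, "->", ids)
```

The three editions with varying titles proper fall together under one work, while the identically titled item by another writer is kept apart. This works only so far as both the author heading and the uniform title are under authority control.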

Although uniform titles are considered important in library cataloguing and a considerable portion of the rules are devoted to them, publishers and booksellers are less concerned about them in their bibliographic work. Now that many of them are constructing and using online/on disk bibliographic databases, providing this data element, particularly for works in literature, music, and law, can be useful in terms of providing more control in identifying and collocating related items and works.

Not only do all the four functions of the title (as a document identifier, a content descriptor, a linking device and an element for the arrangement of entries) remain valid in computer catalogues, but they also perform a more significant role in an online environment. Various title information related to an item can also be useful elements both for representing an entity and as useful points for accessing an item. However, system capabilities/limitations influence their effectiveness.

1.2.5.1 Statement of responsibility (subfield $c in field 245; no value in Z39.50)

Based on the ISBD standard and AACR2, the name of author(s) and other contributors responsible for the intellectual or artistic content of the work should be transcribed (according to the title page information) after the title statement. Although in many current MARC formats the statement of responsibility is not indexed and therefore is not searchable in online catalogues, its major function is to display the link between name(s) and the title in the bibliographic record and thus to further identify and characterise the work. This function of the 'statement of responsibility' element is very important in enabling the catalogue user to appreciate the role of different contributors to the creation of the work, especially in cases of shared and mixed responsibility. The names of other contributors to a work, which appear as added entries, are not usually displayed, particularly in brief and medium displays. The lack of an explicit treatment of this data element in online displays, as will be discussed in detail in Chapter 6, can obscure the description of the entity or even lead to confusion on the part of the user, thus reducing the identifying function of the record.

Another problem with the lack of a proper treatment of all the names in the 'statement of responsibility' area is that, if only the first statement is compulsory in cataloguing and the recording of the names of other collaborators is optional, the catalogue loses the potential to be searched under added access points. This is in conflict with one of the most important objectives of the catalogue: to show what the library has by a given author. As discussed in Chapter 2, rules such as the 'rule of three', which limits the number of authors to a maximum of three, are not relevant to an online environment, where storage and retrieval capabilities are much greater. Users would certainly benefit if additional author/contributor access were provided from the statement of responsibility.

A reason for considering as mandatory the transcribing of the name of the author in the 'statement of responsibility' field according to the form in which it appears on the chief source of information (e.g., title page) is in its potential use for free text searching in computerised catalogues. This is particularly useful when the form of name is different from the author heading. In the case of other contributors to a work, free text searching also makes it possible to search on names if they are not indexed as added access points in block 7XX. The 'statement of responsibility' could then be an alternative means for searching and retrieval of items.
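A minimal sketch of this fallback from controlled headings to free text in the statement of responsibility is given below. The record layout (the key '245c' standing for subfield $c of field 245) and all the names are hypothetical, and matching is reduced to whole-word comparison.

```python
import re

def words(text):
    """Lowercased set of the words in a text string."""
    return set(re.findall(r"\w+", text.lower()))

def search_names(records, query):
    """Return (record id, source) pairs for records matching every query word.

    Controlled headings (100/700) are tried first; the transcribed
    statement of responsibility is used only when they do not match.
    """
    terms = words(query)
    hits = []
    for rec in records:
        headings = words(" ".join(rec.get("100", []) + rec.get("700", [])))
        sor = words(rec.get("245c", ""))
        if terms <= headings:
            hits.append((rec["id"], "heading"))
        elif terms <= sor:
            hits.append((rec["id"], "statement of responsibility"))
    return hits

# Invented records: the illustrator of record 1 has no added entry in 7XX
records = [
    {"id": 1, "100": ["Smith, John"], "700": ["Jones, Mary"],
     "245c": "John Smith and Mary Jones ; illustrated by Ann Lee"},
    {"id": 2, "100": ["Lee, Ann"], "245c": "by Ann Lee"},
]
found = search_names(records, "Ann Lee")
print(found)
```

Record 1 is retrieved only because the transcribed statement of responsibility was searched; had that statement been shortened, the contribution of the illustrator would have been unreachable.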

It should be noted, however, that the name transcribed in the statement of responsibility may be in a form different from the name on the title page of other publications by the same person. Thus the name in subfield $c in the 245 field cannot be used to collocate all the works/items by a particular author. Neither can it be an organising element for the arrangement of records retrieved in response to a query. For this reason, the names of persons also appear in controlled headings (main and/or added entries) to allow the arranging and collocating functions of the bibliographic record.

1.2.6 Edition information

(250 in USMARC; no value in Z39.50)

The 'edition' area includes the edition statement and any statement of responsibility associated with it. Since information in a work may differ from edition to edition, it is essential that this change be shown to the user. The 'edition statement' usually denotes intellectual rather than physical change of the item being described from other items of similar origin. It is also important in that it is an element for the further identification and characterisation of documents, and is important to various creators and users of the bibliographic record: to publishers, booksellers, acquisitions librarians, cataloguers, reference librarians and, most importantly, end users, in that it expresses the currency of a work. The proper description of the edition information will also prevent duplication of records and will help the searcher to identify the needed item properly. This is crucial for cooperative cataloguing systems and union catalogues that merge the catalogues of many libraries into one database.

In the context of the 250 field, there is little opportunity for the development of linkages between two editions of the same work. It is very important to include edition information in its right place in bibliographic records and display it for the better identification of entities. As will be discussed further in Section 1.2 in Chapter 7, the treatment of edition information needs a new approach in the online catalogue, in that cataloguing principles and rules should address both how to include such information in the record and how to display it. In anticipating the bringing together of different linking devices for the same information, all edition information and its linkages could be incorporated into a single block with tags capable of establishing linkage between related items exactly, as is possible in database systems. According to Laurel Jizba (<20676lj@msu.edu>, in a posting to USMARC list, 9 June 1994), "Maybe the place to locate all edition information is in the note area, coded in a new field next to the 533 note, or maybe the reproduction information needs to move up closer to the 250."

1.2.7 Publication, Distribution, etc. (Imprint)

(field 260 in USMARC; values 31 and 59 in Z39.50)

Information concerning the place(s) of publication, the names of publisher(s) and the date(s) of publication constitutes the 'imprint.' This kind of information is usually attributed to documents at the item level, when an edition of a work is produced and published. The potential uses of imprint data add to the value and functionality of the bibliographic record for both libraries and publishers/booksellers. As identifiers, the data elements in the imprint area are usually more significant to publishers, booksellers, acquisition librarians, cataloguers and reference librarians than to end-users. In many cases the place of publication, the name of publisher, and the date of publication reflect the value, quality and orientation of the document. According to Hagler (1991: 40), "Publication information, like most other bibliographic data, is of value in identifying both the content (the work) and the document." In a sense, the publication information helps the catalogue to fulfil one of its functions perceived by Cutter: to assist in the choice of a book.

The publication information in cataloguing codes has been traditionally devised for description and not for access. In current cataloguing codes and MARC formats, the data elements in the imprint area are not prescribed to be used as access points, organising elements or linking devices. Although in the taxonomy developed by Tillett (1991a), such data elements demonstrate 'shared characteristic relationships' (e.g., common publisher, date of publication and country of publication), the traditional catalogue cannot use them to display bibliographic relationships. In a broad sense, these data elements can be used to search items published in a given place or by a given publisher or in a given time. However, since such a search can result in too many records, this approach is usually not advisable. Instead, each of the data elements in the imprint area can be used in conjunction with other key data elements, such as the author's name and title, to narrow down the results of a search. It should be added, however, that in some cases the publication information is the only clue for users to search for items they are looking for. These data elements have also been used to limit search results in A&I services and some online catalogues: for example, the date of publication has already proved to be an effective element for restricting or sorting search results in A&I services (for example, see Siegfried, Bates and Wilde, 1993: 283). Most online catalogues provide the same facility.
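The use of the date of publication to restrict and sort a result set can be sketched as follows. The function name and record structure are assumptions; the sample entries are taken from Figure 5.1.

```python
def limit_by_date(records, start=None, end=None, newest_first=True):
    """Keep records whose date falls within [start, end], then sort by date."""
    kept = [r for r in records
            if (start is None or r["date"] >= start)
            and (end is None or r["date"] <= end)]
    return sorted(kept, key=lambda r: r["date"], reverse=newest_first)

hits = [
    {"author": "Boissonnade Prosper", "date": 1927},
    {"author": "Burns Arthur Robert", "date": 1932},
    {"author": "Childe V Gordon Vere Gordon", "date": 1934},
    {"author": "Perrier Edmond", "date": 1957},
    {"author": "Vogt Joseph", "date": 1993},
]
limited = limit_by_date(hits, start=1930, end=1960)
for r in limited:
    print(r["date"], r["author"])
```

The date is applied here as a limit on a set already retrieved by title, in line with the point made above that imprint elements work best in conjunction with other key data elements rather than as primary search keys.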

With a separate identification of each data element in the 260 field (and also the 'date of publication' in 008/07-10, 260$c, 046 and 533$d; and 'place of publication' in 008/15-17) it is possible to index these elements for later manipulation in bibliographic databases in the finding, identifying and housekeeping functions. For example, with keyword access it is possible to do further searching on the name of the publisher and find other works by the same publisher. However, recording names such as 'Aust. Govt. Pub. Service' does not help in that regard. In the case of a need to identify publications by a publisher, some systems allow searching on subfield $b in 260.

A problem that may arise from different approaches in the recording of publication information is that a document with more than one place of publication or more than one publisher may result in duplicate records in shared environments since each cataloguing agency follows its own priorities. This problem will not only result in duplicate records in a shared environment but will also mislead the catalogue user "into thinking there are two editions which differ in some significant respect" (Hagler, 1991: 41). Similarly, a lack of uniformity in the recording of the form of names of places of publication and the names of publishers, particularly of government bodies, may lead to the same problem unless suitable matching algorithms exist. Uniformity of names in the Imprint area is important, especially to publishers and booksellers, who do not provide additional points of access to such names as some cataloguing systems do in the field 710.
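A matching algorithm of the kind referred to above might, in its simplest form, compare normalised keys. The following is a sketch under stated assumptions, not a description of any union catalogue's actual procedure: the abbreviation table, the choice of key elements and the sample records are all invented for illustration. The place of publication is deliberately left out of the key, since agencies may record different places for the same document.

```python
import re

# Illustrative expansion table; a real system would need a far larger one
ABBREVIATIONS = {"aust": "australian", "govt": "government", "pub": "publishing"}

def normalise(text):
    """Lowercase, drop punctuation and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def match_key(rec):
    """Key on title, publisher and date; place is ignored on purpose."""
    return (normalise(rec["title"]), normalise(rec["publisher"]), rec["date"])

def find_duplicates(records):
    """Return (first id, duplicate id) pairs for records sharing a match key."""
    seen = {}
    dupes = []
    for rec in records:
        key = match_key(rec)
        if key in seen:
            dupes.append((seen[key], rec["id"]))
        else:
            seen[key] = rec["id"]
    return dupes

records = [
    {"id": 1, "title": "Annual report", "place": "Canberra",
     "publisher": "Aust. Govt. Pub. Service", "date": 1990},
    {"id": 2, "title": "Annual report", "place": "Sydney",
     "publisher": "Australian Government Publishing Service", "date": 1990},
    {"id": 3, "title": "Annual report", "place": "Canberra",
     "publisher": "Australian Government Publishing Service", "date": 1991},
]
duplicates = find_duplicates(records)
print(duplicates)
```

Records 1 and 2 are matched despite the abbreviated publisher name and the differing place, while record 3 is kept apart by its date. Such a key is necessarily crude, and a usable algorithm would also have to weigh edition statements and standard numbers.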

Of the data elements in the Imprint Area, only 'date of publication' (value 31) and 'place of publication' (value 59) have been considered in the Z39.50 standard. There is no value for the name of publisher.

1.2.8 Physical Description Area

(fields 300 and 533 in USMARC; no value in Z39.50)

The data elements recorded in this area include the extent of the item and other physical details and dimensions. These allow for the physical description of the item and can help in its further identification, in that they are indicative of the nature of documents. Although such elements are transcribed according to the physical item produced by the publisher and the manufacturer at the item level, they may have roots at the 'work level' and may indicate the nature of the work; for example, the extent of an item and its type of illustrations may be considered as an indication of the depth of treatment of content. In this respect, they are important to the different users of bibliographic records: publishers, booksellers, library suppliers, librarians and sometimes to end users. In some cases, the information recorded in the physical description area assists both the librarian and the reader in their choice of relevant items, thereby fulfilling the 'choice function' of the catalogue. For non-book materials in particular the physical description area is very important to identify the carrier/container (007), which will be important to some users. In general, these data elements are not used in searching and have no finding or collocating function. There is no value attributed to 'physical data elements' in Z39.50.

1.2.9 Series information

(400, 410, 411, 440, 490, 830, subfield $t in 800, 810 and 811 in USMARC; value 5 in Z39.50)

'Series' are defined in AACR2R (1988: 622) as: "A group of separate items related to one another by the fact that each item bears, in addition to its own title proper, a collective title applying to the group as a whole." As the sixth area of ISBD(G), the series area includes the series statement and the statement of responsibility and number(s) associated with it.

Series statements are the type of data elements that are usually attributed to the bibliographic entity at the publishing stage, i.e., at the item level. Series information has a three-fold function: 1) it further characterises the item and indicates the orientation of the work by showing a specific subject area or the interest of its publisher, 2) it collocates items in the same series whether by the publisher or the author or the topical area, and 3) it can be used in the searching and identification of items. It is equally important bibliographic information for publishers, booksellers, librarians and end users. In the library context, the information in a series statement is useful to acquisitions librarians, cataloguers and reference librarians as well as to library patrons for the identification and assembling of items in a group, i.e., whole/part relationships.

As a linking device, series statements maintain the whole/part relationships (see Table 4.1 in Chapter 4). The approach is especially useful in searching publications of corporate bodies, because the name of a body usually expresses its field of activities and its mission-oriented responsibilities. However, a problem with series information is that, in some cases, the degree of topical connection among items is not bibliographically relevant (Hagler, 1991: 53). This implies that, in some cases, the series title is not to be considered as an access point and need not be indexed. For example, some publishers' series which bear only the name of the publisher, without any indication of the content of the items in the series, are not useful elements.

In a manual environment, the information in series statements is searchable only through added entries under the exact form of the series title and also through name/title cross-references provided for them. In a computerised catalogue, series information can be searched not only through added access points (field 800 for personal name, 810 for corporate name, 811 for meeting name, 830 for uniform title), but also through keywords both in series statements and series added entries. Since it is difficult for the user to remember the exact wording of series statements, keyword searching can help retrieve such information.

The wording and order of the series title, followed by the related number within the series, followed by the title of any subseries and, finally, the related number should be treated in a uniform manner so as to permit easy collocation of items in the same series. Series information, particularly for online catalogues, should be subject to authority control so that it is reliably searchable.

In terms of searching and retrieval of series information in remote databases, the Z39.50 standard requires value 5 for series title.

1.2.10 Notes

(5XX in USMARC; values 62 and 63 in Z39.50)

Notes are additional bibliographic information (e.g., extended physical description, relationship to other works or contents) that are recorded for different purposes. Sometimes notes are regarded as indispensable; they are included as justification for added entries (see rule 21.29F in AACR2R). Some notes are included because the cataloguer considers them important for further identification of the item or for displaying its relationships to another work.

Over the past two centuries, the notes area has come to embody more, and more varied, types of information. A comparison of the type and amount of notes information on a catalogue card with different fields in the 5XX block in the USMARC format shows the variety of notes (more than 50 different coded notes) that perform different functions, especially the identifying and relating functions: 'general note' (field 500), 'with note' (field 501), 'dissertation note' (field 502), 'bibliography note' (field 504), 'formatted contents note' (field 505), 'summary, abstract, annotation and scope note' (field 520), 'target audience note' (field 521), 'reproduction information note' (field 533) and 'local note' (field 59X) are some important kinds of notes in the USMARC format. Only the value 62 for 'Abstract' and the value 63 for 'note' have been assigned to notes information in the Z39.50 standard. Not only do notes provide better description and better ways for users to identify the contents of items but they can also facilitate the fulfilment of the catalogue's objectives, especially in a network environment in which the searcher has no direct access to the physical item. In general, there is overall support among cataloguers for adding more extensive notes to bibliographic records (see, for example, different postings to AUTOCAT, 10-16 November 1994). However, library administrators more typically want to eliminate notes, for example, those that are included merely to justify an added entry or are redundant with other information or access fields.

Notes include both relationship and non-relationship information. Notes containing relationship information are of various types: for example, notes about title variation and earlier edition(s), notes related to an immediately preceding or succeeding title, notes to acknowledge an equivalent copy, notes to acknowledge the 'original', notes about the described item on the analytical entry and notes of minor accompanying matter. In a sense, notes more than any other data elements demonstrate different types of relationships. Descriptive relationships, which hold between a work and a description, criticism, evaluation or review of that work, can only be found in the '500' general notes. Reporting the results of her study of the catalogue of the Library of Congress, Tillett (1992c: 182) found that 70% of the records with '500' notes were records with relationship information. Her study also revealed that every category of relationship, except the shared characteristic relationship, is represented in the general notes.

In their present structure, notes do not maintain a practical or automatic linkage between related records. According to Tillett (Ibid: 183), some types of relationships, such as sequential and whole-part relationships, are more likely to be shown by explicitly coded fields while other types, such as equivalence, derivative, descriptive and accompanying relationships, are more likely to be embedded in a general note. This disparity is also apparent in the MARC format. This is a case in which the structure of the MARC format inhibits the effective retrieval of the full content of bibliographic records. Many links to related records may be coded in a more explicit and structured format in the notes area. Tillett (1989a: 161) proposes that types of relationships be tagged and reflected through notes. This will avoid the need for a redundant tracing in a machine-readable record. To achieve this, Tillett (Ibid: 161) points out: "We would need to slightly modify some of the MARC tags and indicators for notes which incorporate links to another bibliographic record."

In a traditional sense, notes created according to current rules are mainly descriptive and are not structured for effective computer retrieval. It is not possible to search notes information via controlled access, unless they have been provided as added entries in blocks 7XX and 8XX. Even in formal notes in which an invariable introductory word or phrase or a standard form of words is presented (AACR2R rule 1.7A3), the aim is primarily for further description and not for access. If notes are to be searched as a part of the bibliographic record, they should be described and coded in a way that will allow the system to retrieve them. Duke (1989: 123) points out that:

However, if the contents note is also a point of analytical access to the record, transcribing the names of authors and complete titles as they appear on the publication would enhance the potential for retrieval.

The USMARC format allows for specific subfielding of the 505 content notes, but many libraries consider this unnecessary and laborious.

Since the information in the Notes Area can be retrieved through keyword searching, the transcription of some data elements, such as titles and personal names, requires a high degree of uniformity and fullness. Whether the forms of names as they appear on the title page (or the chief source of information) are sufficient for effective searching and retrieval, or whether they need to be subject to authority control is a question that needs to be dealt with in a uniform approach. Whether there also needs to be a standard approach to the number of authors that should be named and whether subtitles should be included in tables of contents is a matter of local versus network preferences. This problem has significant implications for the design of the MARC format as well as for cataloguing codes. In relation to the content of notes for potential retrieval, Duke (1989: 123) also states that:

As long as the contents note was used only in a display as a block paragraph on the catalog record, there was no need to precisely delineate its elements; however, to assist the computer in manipulating contents information in retrieval a formal structure of tags and subfields is necessary.
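The "formal structure of tags and subfields" that Duke describes can be illustrated with a minimal sketch. It assumes a contents note already split into (subfield code, value) pairs, with $t holding a title and $r a statement of responsibility as in the USMARC 505 field. The names `ContentsEntry`, `strip_isbd` and `parse_enhanced_505` and the sample note are invented for illustration, and the punctuation handling is deliberately naive.

```python
from typing import NamedTuple


class ContentsEntry(NamedTuple):
    title: str
    responsibility: str  # empty when the note names no author for the title


def strip_isbd(value: str) -> str:
    """Remove trailing ISBD punctuation (' /', ' --', '.') and spaces.

    Naive on purpose: a title ending in an abbreviation would lose its full stop.
    """
    return value.strip().rstrip("/-. ").strip()


def parse_enhanced_505(subfields: list[tuple[str, str]]) -> list[ContentsEntry]:
    """Pair each $t (title) with the $r (statement of responsibility) after it."""
    entries: list[ContentsEntry] = []
    for code, value in subfields:
        if code == "t":
            entries.append(ContentsEntry(strip_isbd(value), ""))
        elif code == "r" and entries and not entries[-1].responsibility:
            entries[-1] = entries[-1]._replace(responsibility=strip_isbd(value))
        # Other subfields (e.g. $g, miscellaneous information) are ignored here.
    return entries


# Hypothetical contents note, invented for illustration.
note = [
    ("t", "First essay /"),
    ("r", "John Doe --"),
    ("t", "Second essay /"),
    ("r", "Jane Roe."),
]
entries = parse_enhanced_505(note)
titles = [entry.title for entry in entries]
authors = [entry.responsibility for entry in entries]
```

Once the note is delimited in this way, a system can send `titles` and `authors` to separate indexes. With an undivided block paragraph it could only index the whole note as free text.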

In terms of the requirements of the bibliographic record in an online catalogue and with regard to potential keyword access to notes information, it is desirable that notes be provided in such a form and language that their content can be retrieved and accessed effectively. In terms of providing access to and collocation of related works or items, different types of notes require different treatments. There would be a need to have relevant rules about how to make notes and to provide relationship information (i.e., links) in machine-readable records. This is one of the most important areas cataloguing codes and MARC formats could contribute to the effective fulfilment of the collocating function of the catalogue.

In summary: notes are considered necessary both for further description and for better identification of an item and also for a more explicit demonstration of bibliographic relationships. Although some librarians believe that bibliographic records should be briefer than they are now and that notes, in particular, if not displayed effectively, should be reduced in number, notes are at present considered to be more important than they were in the past and will probably have a more significant role in the future structure of records. With the capability of online systems to maintain large amounts of data on a record and with various search/retrieval/display capabilities, it is possible to input more data elements as notes to a record. Notes can perform more functions in an online environment where the remote user has no physical access to the items.

1.3 FUNCTIONAL CATEGORISATION OF DATA ELEMENTS

Based on the discussions in the preceding section and according to their roles in fulfilling various functions in different environments, data elements in the bibliographic record can be categorised into the following five groups:

1) 'finding' data elements: attributes that are necessary for searching and retrieval of known items. They provide access to the text of records. Such data elements include author headings, titles proper, other title information and standard/control numbers such as ISBN, ISSN and LC control numbers.

2) 'identifying' data elements: those that help in the further characterisation of the work/item so that the searcher can distinguish a particular edition/manifestation over other editions/manifestations of the same work. These elements are: author headings, title information, edition information, series information, language, readership level, summary/abstract, type of publication, genre/form, place of publication, name of publishers, dates of publication and country of publication.

3) 'organising' data elements: these elements are restricted to attributes that help in the arrangement of records, file indexes and in the sorting and limiting of the search results when more than one record is retrieved. Author headings, titles proper, dates of publication, call numbers and record numbers are among the most important organising data elements.

4) 'relating' elements: those attributes that show the relationship of the work/item to other related works/items. These elements may include author headings, uniform titles, edition statements, series information, and some kinds of the notes information.

5) 'locating/accessing' data elements: attributes that provide physical access to copies of the item. Call numbers, location numbers, the names of library(ies), in case of union catalogues, are major accessing elements.
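The five groups above can be expressed as a simple lookup, which makes it easy to see that one element may serve several functions. The sketch below copies the group members from the list as worded there. The dictionary layout and the names `FUNCTION_GROUPS` and `functions_of` are assumptions made for illustration only.

```python
# Members are taken from the five groups listed above, in the singular form.
FUNCTION_GROUPS: dict[str, set[str]] = {
    "finding": {
        "author heading", "title proper", "other title information",
        "standard/control number",
    },
    "identifying": {
        "author heading", "title information", "edition information",
        "series information", "language", "readership level",
        "summary/abstract", "type of publication", "genre/form",
        "place of publication", "name of publisher", "date of publication",
        "country of publication",
    },
    "organising": {
        "author heading", "title proper", "date of publication",
        "call number", "record number",
    },
    "relating": {
        "author heading", "uniform title", "edition statement",
        "series information", "notes information",
    },
    "locating": {"call number", "location number", "name of library"},
}


def functions_of(element: str) -> list[str]:
    """Return, in alphabetical order, every function group that lists the element."""
    return sorted(
        group for group, members in FUNCTION_GROUPS.items() if element in members
    )


author_functions = functions_of("author heading")
call_number_functions = functions_of("call number")
```

Under these assumptions the author heading appears in four of the five groups, while the call number appears in both the organising and the locating groups.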

Data elements may perform other functions that are very important in computerised systems. In addition to housekeeping functions (for database management such as detecting duplicate records), some data elements can be used in narrowing the scope of a search. For example, date of publication, language of the work, readership level, country of publication, genre/form and document type are very much used by different users to make searching more specific. This is an area that should be taken into consideration in any analysis of the functions of data elements in bibliographic records and, consequently, in the re-examination of the catalogue's functions.

Matrix 5.2 and Figure 5.2 demonstrate a categorisation of data elements according to the above types of function. The bibliographic record as a combination of data elements structured according to standards for different operations is the basis for carrying out the required functions.

As can be seen in Matrix 5.2, each data element may perform more than one type of function; indeed, one single type of function may be carried out through different data elements. The identifying function is the most widely served: almost every data element contributes to it. The least served is the locating function, which can be carried out only through the call number. It is interesting to note here that current cataloguing rules are less concerned with the locating function. Control fields, as can be seen, have identifying and organising functions only and are usually used by librarians. Notes are not used for finding purposes. They can be used mostly for further identification of entities and in some cases for displaying the relationship of the item to other items (see Section 1.2.10).

As discussed in Section 1.2.2 in this chapter, the MARC area for standard numbers such as ISBN and ISSN is sometimes used as a linking device to relate, for example, a paperback version or an edition distributed by a different publisher. In the case of serials with variant titles, the ISSN can be used to link successive titles of serials and to link separate records for different titles.

In general, while descriptive data elements are more common to different users in the book world and in libraries, organising elements may differ in the two environments as a result of different objectives and functions. Nevertheless, in a variety of environments, encompassing different material types and varying degrees of focus on data elements, it is difficult to develop a categorisation that is both totally exhaustive and mutually exclusive.

Figure 5.2 shows that data elements can be grouped in three major categories: 1) control fields including data elements such as standard numbers, language, type of publication and country of publication, 2) description including descriptive data elements such as those recorded in the ISBD description, and 3) headings or controlled access points including name headings, uniform titles, and subject headings. The lines which link these categories or packages show that there is a close relationship between them, in that they constitute the structure of the bibliographic record in an automated system. The reason for such a categorisation for a record structure is that the data elements in each box are prescribed and controlled according to different standards. The first category is prescribed and formulated by the MARC format. The second category is recorded according to the ISBD standards and the third category is controlled according to name and subject authority lists. As can be seen, the three categories or packages of data elements perform very similar functions. However, as mentioned in chapters 2 and 3, with respect to the treatment of data elements there are some inconsistencies and overlaps between cataloguing rules, ISBDs and MARC formats. This is a source of problems in the proper functioning of the bibliographic record. Any inconsistency or overlap between cataloguing standards influences the different functions of the bibliographic record. As an example, the inconsistency in the indexing of cataloguing data will be discussed in the next part of this chapter.

Matrix 5.2 Functional categorisation of data elements in the bibliographic record


Functions               Finding   Identifying Organising  Relating  Locating
Data elements           function  function    function    function  function

Control field:                                                               
     Publication type             x           x                              
     Language code                x           x                              
     Dates                        x           x                              
     Country code                 x           x                              
Standard numbers:                                                            
     ISBN               x         x                       x                  
     ISSN               x         x                       x                  
     Call number        x                     x           x         x        
Language of item                  x                                          
Author heading          x         x           x           x                  
Title information       x         x           x           x                  
Statement of            x         x                                          
responsibility                    x                       x                  
Edition statements      x         x                       x                  
Place(s) of             x         x                       x                  
publication             x         x           x                              
Name of publisher(s)              x                                          
Date of publication     x         x           x           x                  
Physical description                                                         
Series information                x                       x                  
Notes:                            x                       x                  
       Content note               x                                          
                                  x                                          
Summary/abstract        x         x                       x                  
       Target audience  x         x           x           x                  
       Reproduction                                                          
note                                                                         
Added access points                                                          
Genre/form                                                                   




              CONTROL FIELDS                      ------- to find
ACCESS        standard numbers and codes (e.g.,   ------- to identify
(controlled)  ISBN, ISSN, LCCN, publishers' no,   ------- to relate
              national library no., etc.),        ------- to organise/sort/limit
              language, type of publication,      ------- to access/locate
              country of publication


               DESCRIPTION                        ------ to find
DESCRIPTION    title and statement of             ------ to identify
&              responsibility,                    ------ to select
ACCESS         edition statement,                 ------ to relate
(uncontrolled) imprint information,
               publication, distribution
               information,
               physical description information,
               series information,
               notes information


              HEADINGS (controlled access         ------- to find
ACCESS        points)                             ------- to identify
(controlled)  authors/contributors' name,         ------- to relate
              uniform titles,                     ------- to organise/sort/limit
              series titles,
              subject headings

linked to:

AUTHORITY FILES:
Name authority file
Uniform titles authority file
Series authority file
Subject authority file


Figure 5.2 A functional model for the bibliographic record in its electronic format

PART TWO

A UNIFORM APPROACH TOWARDS THE INDEXING OF CATALOGUING DATA IN THE BIBLIOGRAPHIC RECORD

2.0 Introduction

The correct storage and manipulation of data elements is very important for the various functions of the bibliographic record and for the objectives of the catalogue, especially in a global online environment. In relation to the requirements of the bibliographic record for searching, retrieval and display, agreement needs to be reached nationally and internationally regarding the coding of certain data elements to allow for their indexing. This is a necessary requirement and a principle for large catalogues and for databases in shared environments. The treatment of some fields and subfields for indexing and display does not, at present, follow a uniform approach.

Part of the trouble lies in the fact that, unlike cataloguers in the manual catalogue environment, cataloguers now have less control over some of the cataloguing processes, such as the manipulation and display of cataloguing data. There is no guidance in current cataloguing codes for the handling of cataloguing data in online systems, such as the consistent indexing of fields and subfields for their searching, retrieval, display and sorting. Another reason is that database vendors provide record formats which may not be flexible in terms of indexing of cataloguing data. Gildemeister (<eeglc%cunyvm.bitnet@ubvm.cc.buffalo.edu>, in a posting to AUTOCAT, 8 September 1995) notes that online systems are designed by vendors for what people "want", i.e., not necessarily what cataloguing standards require.

In this part some of the problems concerning lack of consistency in indexing of cataloguing data will be explored in relation to the uniform title, title proper, series information and notes information as examples of indexing problems in online catalogues. Such problems have often been a cause for discussion among cataloguers working in automated environments. A review of postings to AUTOCAT and USMARC lists indicates that a considerable amount of discussion is concerned with problems associated with the indexing of fields and subfields.

2.1 Indexing of uniform titles

One problem that inhibits access to records through a uniform title as main entry arises in systems where each field in the machine-readable format is indexed in a separate file: it is difficult for the catalogue user to tell which index (e.g., author or title) should be consulted for that particular access point. The discussions by different cataloguers in AUTOCAT (9 June to 11 August 1994 and also 8-9 September 1995) concerning different aspects of uniform titles in USMARC and their implications in online catalogues illustrated how different automated systems have different approaches to this problem. In some systems, uniform titles (e.g., fields 130, 240) are not indexed as titles but are indexed as added entries (e.g., 730, added entry--uniform title). On the other hand, some systems never duplicate the 130 or 245 (title statement) in 730. Coral (1992: 30) points out that:

The fact that the first uniform title for a bibliographic record is carried in a separate, but unlinked, field in many MARC formats, while all subsequent uniform titles are carried in linked fields, presents indexing problems for most computer systems.

In a USMARC record the first uniform title is carried in the 240 field and its related author in the 1XX field. These two fields are not linked to one another. All subsequent uniform titles with authors are carried in 7XX fields with the author in the subfield a and the title in the subfield t. Thus the author and title are linked together. Most systems do not find it easy to keep the 1XX and 240 linked together in their indexing for some reasons.

In the context of some automated systems uniform titles are indexed as authors. For example, while one would expect 'Bible' or 'Arabian Nights' to be treated as if they were titles and to find them in the title index (due to the related fields in the USMARC format, for example, fields 130, main entry--uniform title and 240, uniform title), these titles are indexed as authors. Studying the problems of uniform title as author, Sanders (1987: 236, 237) states that treating uniform title main entries as author does still occur and inhibits access in some automated systems.

All these problems indicate that there is no consistent approach toward the indexing of uniform titles in online catalogues and that this divergence makes the input of accurate uniform titles into shared databases, and consequently, online retrieval and display difficult and, in some cases, impossible. As a possible solution to such an indexing inconsistency, Z39.50 assigns 'value 6' as the single field for uniform titles (see Appendix 2: List of Attributes from the Z39.50 Standard). In all systems using this standard the searching and retrieval of uniform titles will be carried out in a consistent way.
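The linkage problem that Coral describes can be made concrete with a small sketch. It assumes a record held as a list of (tag, subfields) pairs and builds author/uniform-title pairs for an index. For a 240 the author has to be fetched from a separate 1XX field of the same record, whereas a 7XX field carries the author ($a) and title ($t) together. The function names and the sample record are hypothetical, and real headings would carry dates and punctuation that are ignored here.

```python
from typing import Optional

Subfields = list[tuple[str, str]]
Field = tuple[str, Subfields]


def first(subfields: Subfields, code: str) -> Optional[str]:
    """Return the first value of the given subfield code, if present."""
    return next((value for c, value in subfields if c == code), None)


def uniform_title_entries(record: list[Field]) -> list[tuple[Optional[str], str]]:
    """Build (author, uniform title) pairs for indexing."""
    # The 240 carries no author, so the main entry (1XX) must be located first.
    main_author = None
    for tag, subfields in record:
        if tag in {"100", "110", "111"}:
            main_author = first(subfields, "a")

    entries: list[tuple[Optional[str], str]] = []
    for tag, subfields in record:
        if tag in {"130", "730"}:  # uniform title entered without an author
            title = first(subfields, "a")
            if title:
                entries.append((None, title))
        elif tag == "240":  # linked to the 1XX only by sharing a record
            title = first(subfields, "a")
            if title:
                entries.append((main_author, title))
        elif tag in {"700", "710", "711"}:  # author and title in one field
            title = first(subfields, "t")
            if title:
                entries.append((first(subfields, "a"), title))
    return entries


# Hypothetical record, invented for illustration.
record = [
    ("100", [("a", "Shakespeare, William")]),
    ("240", [("a", "Hamlet")]),
    ("245", [("a", "The tragedy of Hamlet")]),
    ("700", [("a", "Marlowe, Christopher"), ("t", "Doctor Faustus")]),
]
pairs = uniform_title_entries(record)
```

A system that indexes each field in isolation never performs the first loop, which is one way in which the 240 can lose its author in the index.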

2.2 Indexing of titles proper

It is very important in a network or shared environment that a uniform approach be taken to the coding and indexing of various elements of title information. For example, due to different software specifications and varying approaches towards the indexing of 'title proper' ($a) and 'other title information' ($b), the same document may be retrieved in separate indexes in different systems. Some systems index both the $a and $b. They break the title at the earliest possible break and allow searchers to retrieve it under either 'title proper' or 'other title information'. Other systems index only titles proper ($a) and do not provide access to other title information. The treatment of 'other title information' as added entries and indexing them in 730 may cause further problems in online catalogues. If only some libraries index subtitles, the user will assume that they are the only ones that own those titles. If only titles proper are indexed, they may not be distinguishable from similar titles in the resultant brief display format.

Another problem in online retrieval of titles in large databases is that, in some systems, the default index for searching titles is the keyword index. This would result in too many hits. For example, a search (carried out on 12/10/1995) in the RLG (Research Libraries Group)'s database under the title of 'History of Civilization' resulted in 5790 hits of which only a small number of records had the title proper of 'History of Civilization'. The search was, in fact, a default title keyword search.
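The difference between a default keyword search and a search on the title proper as a whole can be sketched as follows. The titles are invented for illustration and the matching rules are simplified assumptions: a keyword search requires every query word to occur somewhere in the title, while an exact search requires the whole title to match. The function names are hypothetical.

```python
import re


def words(text: str) -> list[str]:
    """Lower-case a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_hits(query: str, titles: list[str]) -> list[str]:
    """Titles containing every query word, in any order and position."""
    wanted = set(words(query))
    return [title for title in titles if wanted <= set(words(title))]


def exact_title_hits(query: str, titles: list[str]) -> list[str]:
    """Titles whose full wording matches the query word for word."""
    wanted = words(query)
    return [title for title in titles if words(title) == wanted]


# Invented titles, for illustration only.
catalogue = [
    "History of civilization",
    "The history of Chinese civilization",
    "Civilization and the history of science",
    "Sources in the history of western civilization",
    "History of art",
]
by_keyword = keyword_hits("History of Civilization", catalogue)
by_exact_title = exact_title_hits("History of Civilization", catalogue)
```

In this small file the keyword search returns four titles and the exact search one. In a database the size of the one described above, the same ratio produces thousands of unwanted hits.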

The Z39.50 standard assigns separate values for different title information: 'value 4' for title proper, 'value 5' for series title, 'value 35' for the title proper in another language and/or script, 'value 39' for the running title, 'value 40' for the spine title, 'value 41' for a variation from the title page title appearing elsewhere in the item, 'value 42' for former title, 'value 43' for shortened form of title, and 'value 44' for an expanded (or augmented) title (see Appendix 2: List of Attributes from the Z39.50 Standard).
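The values listed above, together with 'value 6' for uniform titles from Section 2.1, can be held in a small table and used to label a search term. In the sketch below the table keys and the function name `title_query` are assumptions. The output uses the prefix notation `@attr 1=N` found in some Z39.50 toolkits such as YAZ, where attribute type 1 is the Use attribute. It is shown only as a readable way of writing the attribute down, not as the form in which the protocol transmits it.

```python
# Use attribute values as cited in the text (see Appendix 2).
TITLE_USE_ATTRIBUTES: dict[str, int] = {
    "title proper": 4,
    "series title": 5,
    "uniform title": 6,
    "title proper in another language or script": 35,
    "running title": 39,
    "spine title": 40,
    "other variant title": 41,
    "former title": 42,
    "shortened title": 43,
    "expanded title": 44,
}


def title_query(kind: str, term: str) -> str:
    """Label a search term with the Use attribute for the given kind of title."""
    if kind not in TITLE_USE_ATTRIBUTES:
        raise KeyError(f"no Use attribute recorded for {kind!r}")
    if '"' in term:
        # Quoting rules are outside the scope of this sketch.
        raise ValueError("terms containing double quotes are not handled here")
    return f'@attr 1={TITLE_USE_ATTRIBUTES[kind]} "{term}"'


proper = title_query("title proper", "History of Civilization")
series = title_query("series title", "History of Civilization")
```

The same words thus yield two distinct queries, one directed at titles proper and one at series titles, whatever names the local system gives to its own indexes.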

2.3 Indexing of series information

As in the case of titles proper and uniform titles, a difficulty in searching 'series information' in an online catalogue is that the searcher may not have a clear understanding of the index that should be searched for this information. The problem is that there is more than one type of series statement: subject series, commercial publisher series, corporate body series and author series, each of which requires a different solution. In addition, there may be a combination of two types of series; for example, publisher and subject. This has caused the treatment of series information in machine-readable records to be problematic, leading in some cases to catalogue user confusion. As indicated in Section 2.2, the Z39.50 standard assigns 'value 5' to series title.

2.4 Indexing of notes information

The treatment of the field 505 (formatted content notes) is another example of the problem in indexing for searching or display: some systems are not set up to handle the indexing of subfields such as subfield t in the 505 field, which allows the system to limit keyword searches to titles only and subfield r to authors only. In relation to the indexing problems of 505 Cynthia Watters (<WATTERS@myriad.middlebury.edu>, in a posting to AUTOCAT, 24 May 1995) wrote:

We are redoing our keyword indexing table, and the question comes up, do we want to index the 505 $t in the keyword title index or, as has previously been done with the 505 $a, index it in the keyword notes index. If we want to index it in the title, what do we want to do with the $a? Or, put another way, if we index the 505 $t as title, it means two areas to specify to retrieve all titles in 505 fields--both the title fields and the note fields. Along with the question of title is the question of author. Do we want to index the new 505 $r in the keyword author index or the keyword notes index? Author seems potentially more tricky because, unlike the "regular" author fields, the 505 $r will have the name in direct order. Thus if we index it as an author, the searcher may have to specify something like: Doe John or John Doe (our system treats strings as strings; you could say John and Doe, but if you say John Doe it will look for them in that order; you can specify proximity, but this all gets more sophisticated than most users).

The relationship between 505$t and 740s is also important for the efficient indexing of titles. At present, it is a function of how the system indexes. While the coding of titles in 505 as $t is done in some systems to allow their indexing, the addition of 740s may result in double hits in retrieval. Some systems (e.g., MultiLIS) index the 505 up to the / in the title index, so making the 740 redundant (Rebecca Thompson <thompsr%snypotvx.BITNET@UBVM.CC.BUFFALO.EDU>, in a posting to AUTOCAT, 22 May 1995). In other systems, for example, Geac Advance, the 505 field can be indexed into a notes searchable category which is separate from the 'title' searchable index. It is said that this approach "makes it easier to distinguish between a title coming from a 245/246/740 and a title from a 505" (Mitch Turitz <turitz@mercury.sfsu.edu>, in a posting to AUTOCAT, 23 May 1995). In summary, as the discussions about the issue on AUTOCAT (22-29 May and 1 December 1995) indicate, there is at present no consistent approach to the indexing of 505$t and added title entries in the 740s.
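The choices discussed in these postings amount to entries in a keyword indexing table. The sketch below assumes two hypothetical local configurations: one sends 505 $t and $r to the title and author indexes, the other sends both to a notes index. It then shows how a title coded in both 505 $t and 740 produces a double posting in the first configuration. The table layout, the function names and the sample record are all invented for illustration and do not describe any named system.

```python
from collections import defaultdict

# (tag, subfield) -> index name; two hypothetical local configurations.
TABLE_A = {("245", "a"): "title", ("740", "a"): "title",
           ("505", "t"): "title", ("505", "r"): "author"}
TABLE_B = {("245", "a"): "title", ("740", "a"): "title",
           ("505", "t"): "notes", ("505", "r"): "notes"}

Record = list[tuple[str, str, str]]  # (tag, subfield code, value)


def build_index(records: dict[str, Record], table: dict) -> dict:
    """Post each word of each indexed subfield under (index name, word)."""
    index = defaultdict(list)
    for record_id, fields in records.items():
        for tag, code, value in fields:
            name = table.get((tag, code))
            if name is None:
                continue  # subfield not indexed in this configuration
            for word in set(value.lower().split()):
                index[(name, word)].append(record_id)
    return index


def search(index: dict, name: str, word: str, dedupe: bool = True) -> list[str]:
    """Look up one word in one index, optionally merging repeated postings."""
    postings = index.get((name, word.lower()), [])
    return sorted(set(postings)) if dedupe else list(postings)


# Hypothetical record in which an analytic title appears in both 505 $t and 740.
records = {
    "r1": [("245", "a", "Collected essays"),
           ("505", "t", "Winter journey"),
           ("505", "r", "John Doe"),
           ("740", "a", "Winter journey")],
}
index_a = build_index(records, TABLE_A)
index_b = build_index(records, TABLE_B)
raw_title_hits_a = search(index_a, "title", "winter", dedupe=False)
author_hits_b = search(index_b, "author", "doe")
```

The same record therefore behaves differently in the two configurations. In the first, the title search posts the record twice unless postings are merged. In the second, an author search for the contributor named in the contents note finds nothing, because the name has been posted to the notes index only.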

A similar problem can be seen in the different approaches taken by different cataloguing agencies to the indexing of 'Dissertations' or 'Theses' information. Since access to such information in academic libraries is important to many users, in that they want to limit searches to theses or there may be a need to print out a list of theses available in a collection, many libraries provide additional access to theses. However, as different postings to AUTOCAT on the question of 'Theses' (20-26 February 1996) show, this type of data is indexed differently by cataloguers in various fields (such as fields 500, 502, 533, 610, 650, 655, 690, 710 and 830) with a lot of redundancy.

Z39.50 assigns a single field (value 63) for different note information, such as extended physical description, relationship to other works, or contents. As can be seen, this approach cannot fully respond to the indexing, and consequently, to the searching and retrieval of different types of information in the Notes area.

2.5 A possible solution

As a promising solution to the present inconsistencies in the indexing of data elements in online catalogues, the approach in the Z39.50 standard can be followed by automated systems. This is particularly important because of the increasing use of WWW front-ends to library catalogues and other information systems. The Z39.50 standard standardises the indexing of different attributes. Each attribute has a given value (see Appendix 2: List of Attributes from Z39.50) so that any system using the standard can index the cataloguing data in a consistent way.

The user can use the local system to search the local catalogue, local CD-ROM databases of MARC records and also other library catalogues. The Z39.50 standard allows the user to search other catalogues as if he/she is searching the local catalogue with familiar search techniques. In essence, Z39.50 provides a consistent search interface to bibliographic databases.

2.6 Conclusion

In terms of networks and shared environments, the tagging of fields and subfields for improved indexing should follow a more uniform approach so that retrieval and display would be more efficient. What a local system is set up to do may not be in line with the requirements for searching and retrieval by remote users. Existing systems may not actually be capable of such indexing but the potential exists. MARC formats should take advantage of enhanced indexing and create records that not only meet today's system needs and present software but also will meet future enhancements and be portable to other software. The Z39.50 standard is a promising approach towards a consistent handling of data elements for searching and retrieval in networked environments. It can help to remove problems possibly arising from the indexing differences between databases.

SUMMARY AND CONCLUSIONS

The bibliographic record in its machine-readable format is very different from the traditional, manual record. The electronic environment has provided catalogue records with many opportunities and potentialities to carry out more functions in a variety of operations in bibliographic work.

With the advent of online catalogues, the identification of data elements for inclusion in bibliographic records, which have gradually developed alongside the technology of catalogue construction, has reached a new era. To carry out different functions more effectively in the online environment, bibliographic records should be created with regard to a number of factors:

--With ever-increasing developments in information technology and telecommunications and, consequently, ever-increasing availability of bibliographic files online or on-disk, an increased potential exists for bibliographic records to be created and used cooperatively in a variety of environments. This would result in more standardisation as well as economy in record creation.

--In response to the various needs of different users, the bibliographic record will require a minimum set of data elements included for carrying out different functions such as the finding function, the identifying function, the relating function, the organising function and the choosing function. The increase in the number and type of data elements that make up electronic records illustrates the variety of their uses in the online environment. It can be concluded that inclusion or exclusion of data elements depends directly on the functions of the record and that reducing the level of description may lead to a great loss of functionality of the bibliographic record. The more functionality we expect from the bibliographic record, the more data elements would be needed for inclusion.

--The way in which cataloguing data are indexed and tagged in automated systems influences the functionality of bibliographic records. In library cataloguing and for optimal functionality of bibliographic records, the indexing of fields and subfields should follow a uniform approach. This would maintain effectiveness in searching, retrieval and display of bibliographic information both within systems and between systems. In this context and in terms of the identification and handling of data elements, cataloguing standards (codes, MARC formats and the Z39.50 standard) should be brought closer, in that they should provide guidelines for the designation of data elements for machine-readable records. If the rationale of cataloguing principles is to bring uniformity in bibliographic description and effectiveness in access, they should also address the question of uniform approaches to the indexing of cataloguing data.

In summary, developments in the electronic handling and in the exchange of bibliographic data have had, and will continue to have, significant effects on the bibliographic record. A major impact of such developments on catalogues is that the bibliographic record in its machine-readable format can enhance data functionality beyond that found in manual records. Various functions of the bibliographic record in the online environment are carried out through data elements whose choice and content are determined by cataloguing principles and rules, an issue which constitutes the content of the next chapter.

