880 indexing for CJK: creates indexes for all records, even those without 880

Bug #1371101 reported by Suzanne Paterno
This bug affects 2 people
Affects: Evergreen
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

We have added indexes in config.metabib_field to index various fields represented by the MARC 880 field. What we have discovered is that every record, even one without any 880s, gets an entry in the index table for each index we have created.

For title we created the following index in config.metabib_field:
id = 1001
field_class = title
name = titleproper880
label = TitleProper880
xpath = //marc:datafield[@tag='245']/marc:subfield[@code='a' or @code='b']
weight = 1
format = marc21expand880
search_field = true
All the remaining values are false/null.

---
Example Record with an 880/245
http://evergreen.noblenet.org/eg/opac/record/2421719

Index created in metabib.title_field_entry:
source = 2421719
field = 1001
value = Beijing de si miao Temple of Beijing / 北京的寺庙 Temple of Beijing /
---

---
Example Record with NO 880 fields – still receives an index for TitleProper880
http://evergreen.noblenet.org/eg/opac/record/2363729

Index created in metabib.title_field_entry:
source = 2363729
field = 1001
value = Pretty girl gone /
---

I transformed the MARCXML for these two records from biblio.record_entry into the marc21expand880 version using the XSL stored internally in config.xml_transform.

Looking at the marc21expand880 stylesheet and the resulting files made the problem clear. The transformed file still contains all the original MARC fields, plus new datafields that the XSL template creates from the 880s. Those new nodes are emitted as <marc:datafield>, that is, with the same element name and the same marc namespace as the rest of the file. As a result, when the XPath is evaluated it matches both the original field (245) and the new <marc:datafield> node created from the 880 (880/245). This can be seen in the index examples above: the Chinese title's entry in title_field_entry contains both the English 245 and the Chinese 880/245, while the English-only title gets an entry containing just the English 245, where it should have no entry at all.
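The matching behavior described above can be reproduced outside Evergreen with a small Python sketch. The two XML snippets are hypothetical, heavily simplified stand-ins for the transformed files (they are not the attached beijing880.xml or pretty_girl_880.xml), and they assume the 880-derived node is emitted as a plain <marc:datafield> carrying tag 245. Python's ElementTree does not support `or` in predicates, so the subfield code test is done in Python rather than in the XPath.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical, simplified transformed record that had an 880/245:
# the original 245 is kept, and the 880-derived field appears as a
# second marc:datafield with the same tag.
WITH_880 = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">Beijing de si miao</marc:subfield>
    <marc:subfield code="b">Temple of Beijing /</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">北京的寺庙</marc:subfield>
    <marc:subfield code="b">Temple of Beijing /</marc:subfield>
  </marc:datafield>
</marc:record>"""

# Hypothetical, simplified transformed record with no 880 fields:
# only the original 245 is present.
WITHOUT_880 = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">Pretty girl gone /</marc:subfield>
  </marc:datafield>
</marc:record>"""


def matched_titles(xml_text):
    """Return the text of 245 $a/$b subfields, mimicking the index xpath."""
    root = ET.fromstring(xml_text)
    subfields = root.findall("marc:datafield[@tag='245']/marc:subfield", NS)
    return [sf.text for sf in subfields if sf.get("code") in ("a", "b")]


if __name__ == "__main__":
    print(matched_titles(WITH_880))     # original 245 and 880-derived 245
    print(matched_titles(WITHOUT_880))  # original 245 only, still non-empty
```

In this sketch the record without an 880 still yields a non-empty match, which is consistent with the unwanted title_field_entry row reported above.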

I looked at modifying our index XPath so that it matches only the nodes created from the 880s, but nothing about those nodes is distinctive enough to separate them from the original MARC fields.

I did some experimenting on our training system with the TitleProper880 index. Giving the transformed 880 fields a different element name, <marc:datafield880>, and modifying the index XPath to match produced the results we expected: only the records with an 880/245 got an entry in metabib.title_field_entry.
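The effect of the rename can be illustrated with a short Python sketch. It assumes, hypothetically, that the modified stylesheet emits 880-derived nodes as <marc:datafield880> while keeping the linked tag (245) in the tag attribute; the XML snippets are simplified stand-ins, not the output of the actual modified stylesheet.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical transformed record: the 880-derived node now uses the
# distinct element name marc:datafield880 (assumed to keep tag="245").
RENAMED_WITH_880 = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">Beijing de si miao</marc:subfield>
    <marc:subfield code="b">Temple of Beijing /</marc:subfield>
  </marc:datafield>
  <marc:datafield880 tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">北京的寺庙</marc:subfield>
    <marc:subfield code="b">Temple of Beijing /</marc:subfield>
  </marc:datafield880>
</marc:record>"""

# Hypothetical transformed record with no 880 fields.
RENAMED_WITHOUT_880 = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">Pretty girl gone /</marc:subfield>
  </marc:datafield>
</marc:record>"""


def titles_from_880(xml_text):
    """Return 245 $a/$b text taken only from marc:datafield880 nodes."""
    root = ET.fromstring(xml_text)
    subfields = root.findall("marc:datafield880[@tag='245']/marc:subfield", NS)
    return [sf.text for sf in subfields if sf.get("code") in ("a", "b")]


if __name__ == "__main__":
    print(titles_from_880(RENAMED_WITH_880))     # 880-derived title only
    print(titles_from_880(RENAMED_WITHOUT_880))  # nothing matched
```

Here the record without an 880 matches nothing, and the record with one contributes only its 880-derived title, which mirrors the behavior seen on the training system.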

How can we correct this problem? Is the problem with our indexes or the internal stylesheet? Can it be corrected without modifying the internal xsl marc21expand stylesheet?

I have attached the marc21expand stylesheet extracted from config.xml_transform.
I have also included the original MARCXML from the bibs (beijing.xml, pretty_girl_gone.xml) and the transformed XML files (beijing880.xml, pretty_girl_880.xml).

Tags: cleanup