Wikisource:Portal classification system adaptation
|A guide to the adaptation of the Library of Congress Classification system (LCCS) to Wikisource, for use in the Portal namespace, and a subsequent log of changes.|
This page describes the method by which the Library of Congress Classification system (LCCS) was adapted to Wikisource and the subsequent changes to that system.
In addition to casual interest, this page can be used to:
- Explain the differences between the two systems and solve related problems with the classification of works.
- Solve any problems that may occur when cross referencing the original and adapted systems.
- Provide a basis for future versions of the system, if necessary, or for incidental changes.
- Provide a blueprint if it is necessary to revert or repair parts of the system.
The outline of the Library of Congress Classification system can be read in full on Wikisource at Library of Congress Classification, which includes a search function to find topics and their classifications. It can also be read at the Library of Congress's website. More information can be found via the Wikipedia article.
The LCCS uses 21 classes, each represented by a letter of the alphabet. Each class is broken down into subclasses, represented by one or two further letters of the alphabet. More specific classification then uses a number from 1 to 9999, which is itself followed by a Cutter number.
Wikisource's portal space was not initially organised in any way. Most indices were, at that time, in the Wikisource namespace and were also unorganised, which led to duplication in some cases. Any subject index for Wikisource, which is essentially the purpose of portal space on the project, was expected to use a published and authoritative system rather than something homegrown. Such a system also has the advantage of being complete and already tested in practice.
The Dewey Decimal system qualifies under these terms but it is largely under copyright (the original version is in the public domain but it has gaps where reference to the modern world would be needed; adding these will either cause confusion or risk copyright infringement). LCCS is a work of the United States government and therefore in the public domain.
As this classification system is for use with portals and not individual books, the level of detail in the original system is unnecessary. The numeric portions of the call numbers were therefore dropped, leaving just the subclass.
For example, the official Library of Congress call number for On the origin of species by means of natural selection by Charles Darwin is QH365.O2. On Wikisource, any portals corresponding to this book, and those like it, would be classified by just the subclass, the initial alphabetic portion of this call number, or QH.
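The reduction from a full call number to a portal subclass can be sketched in Python. This is only an illustration of the rule described above, not a tool used on Wikisource; the function name `portal_subclass` is hypothetical.

```python
import re


def portal_subclass(call_number: str) -> str:
    """Return the leading alphabetic part of an LCC call number.

    Hypothetical helper: everything after the initial letters
    (class number, Cutter number) is discarded.
    """
    match = re.match(r"[A-Za-z]+", call_number.strip())
    if match is None:
        raise ValueError(f"no alphabetic prefix in {call_number!r}")
    return match.group(0).upper()


print(portal_subclass("QH365.O2"))  # QH
```

The sketch assumes the call number starts with its class letters, as in the Darwin example; it does not handle the Wikisource-specific third letters or placeholder symbols discussed later.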
Further, as explained below, the system needed to be adapted to fit Wikisource's specific needs while it was being implemented. Two more classes (I & X) and several more subclasses were added, while one class (E) was slightly reinterpreted to fit pre-existing material. This is justified as an existing practice when adapting the LCCS to local or specific use. Wikipedia notes that "The National Library of Medicine classification system (NLM) uses the classification scheme's unused letters W and QS–QZ. Some libraries use NLM in conjunction with LCC, eschewing LCC's R (Medicine). Others prefer to use the LCC scheme's QP-QR schedules and include Medicine R."
The subclasses as they exist are not all directly usable as portals on Wikisource. The most extreme example is Subclass PT: German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature. The directly equivalent portal, Portal:German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature, is impractical for use on Wikisource. Therefore, where this problem occurs, the Library of Congress Classification system subclasses have been divided into new subclasses. The first term of each retains the old classification; subsequent terms add a third letter to create the new classification.
For example, Subclass PT: German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature becomes:
- Subclass PT: German literature
- Subclass PTA: Dutch literature
- Subclass PTB: Flemish literature
- Subclass PTC: Afrikaans literature
- Subclass PTD: Scandinavian Literature
- Subclass PTE: Old Norse Literature
- Subclass PTF: Modern Icelandic Literature
- Subclass PTG: Faroese Literature
- Subclass PTH: Danish Literature
- Subclass PTI: Norwegian Literature
- Subclass PTJ: Swedish Literature
Note: Subclass AC is an exception to this pattern. There was an existing Collective works index when this system was implemented, so this was used as the first term instead of Collections, which became the second term.
This is more complicated within Class K, Law, which is explained separately (below).
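The general splitting rule (the first term keeps the old code, each later term gains a third letter) can be sketched as follows. The function name `split_subclass` is hypothetical, and the sketch deliberately ignores the Subclass AC exception and the special handling of Class K.

```python
from string import ascii_uppercase


def split_subclass(code: str, topics: list[str]) -> dict[str, str]:
    """Map new subclass codes to topics.

    The first topic keeps the original code; each later topic gets
    the original code plus A, B, C, ... in order.
    """
    if not topics:
        raise ValueError("at least one topic is required")
    if len(topics) - 1 > len(ascii_uppercase):
        raise ValueError("too many topics for single-letter suffixes")
    result = {code: topics[0]}
    for suffix, topic in zip(ascii_uppercase, topics[1:]):
        result[code + suffix] = topic
    return result


pt = split_subclass("PT", [
    "German literature", "Dutch literature", "Flemish literature",
    "Afrikaans literature", "Scandinavian literature",
    "Old Norse literature", "Modern Icelandic literature",
    "Faroese literature", "Danish literature",
    "Norwegian literature", "Swedish literature",
])
for new_code, topic in pt.items():
    print(f"Subclass {new_code}: {topic}")
```

Run on the Subclass PT terms, this reproduces the codes PT through PTJ listed above.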
Some classes contain subclasses with the same code as the class itself. For example, subclass P (Philology and Linguistics) within Class P (Language and Literature). These are represented by a non-alphabet symbol as the second letter of the classification. The examples of this here use an asterisk; however, this causes problems in practice as the wikicode interprets this as a bullet point in some cases. Any symbol or number can be used in its place instead (for example, a hyphen).
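As a small illustration of swapping the placeholder symbol, the following sketch rewrites asterisk-based codes to use another symbol. The function name `safe_code` and the choice of a hyphen as the default are assumptions made for the example only.

```python
def safe_code(code: str, placeholder: str = "-") -> str:
    """Replace the '*' placeholder with a symbol that wikicode
    will not interpret as a bullet point (hypothetical helper)."""
    if len(placeholder) != 1 or placeholder.isalpha() or placeholder == "*":
        raise ValueError("placeholder must be one non-letter character other than '*'")
    return code.replace("*", placeholder)


print(safe_code("P*"))   # P-
print(safe_code("G*A"))  # G-A
print(safe_code("QH"))   # QH (unchanged: no placeholder present)
```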
|Original classification||New classification|
|AC||Collections. Series. Collected works||ACA||Collections|
|AG||Dictionaries and other general reference works||AGA||Reference Works|
|AM||Museums. Collectors and collecting||AMA||Collectors and Collecting|
|AY||Yearbooks. Almanacs. Directories||AYA||Almanacs|
|AZ||History of scholarship and learning. The humanities||AZA||The Humanities|
|BL||Religions. Mythology. Rationalism||BLA||Mythology|
|BP||Islam. Bahaism. Theosophy, etc.||BPA||Bahaism|
|CD||Diplomatics. Archives. Seals||CDA||Archives|
|DB||Austria - Liechtenstein - Hungary - Czechoslovakia||DBA||History of Liechtenstein|
|DBB||History of Hungary|
|DBC||History of Czechoslovakia|
|DC||France - Andorra - Monaco||DCA||History of Andorra|
|DCB||History of Monaco|
|DG||Italy - Malta||DGA||History of Malta|
|DK||Russia. Soviet Union. Former Soviet Republics - Poland|
|DKB||History of the Former Soviet Republics|
|DKC||History of Poland|
|DL||Northern Europe. Scandinavia|
|DP||Spain - Portugal||DPA||History of Portugal|
|G*||Geography (General). Atlases. Maps||G*A||Atlases and Maps|
|GA||Mathematical geography. Cartography||GAA||Cartography|
|GF||Human ecology. Anthropogeography||GFA||Anthropogeography|
|HB||Economic theory. Demography||HBA||Demography|
|HD||Industries. Land use. Labor||HDA||Land Use|
|HN||Social history and conditions. Social problems. Social reform||HNA||Social Problems|
|HQ||The family. Marriage. Women||HQA||Marriage|
|HT||Communities. Classes. Races||HTA||Classes|
|HV||Social pathology. Social and public welfare. Criminology||HVA||Social and Public Welfare|
|HX||Socialism. Communism. Anarchism||HXA||Communism|
|JS||Local government. Municipal government||JSA||Municipal Government|
|JV||Colonies and colonization. Emigration and immigration. International migration||JVA||Emigration and Immigration|
|PA||Greek language and literature. Latin language and literature||PAA||Latin Language And Literature|
|PB||Modern languages. Celtic languages||PBA||Celtic Languages|
|PD||Germanic languages. Scandinavian languages||PDA||Scandinavian Languages|
|PG||Slavic languages and literatures. Baltic languages. Albanian language||PGA||Baltic Languages|
|PH||Uralic languages. Basque language||PHA||Basque Language|
|PL||Languages and literatures of Eastern Asia, Africa, Oceania||PLA||Languages and Literatures of Africa|
|PLB||Languages and Literatures of Oceania|
|PM||Hyperborean, Indian, and artificial languages||PMA||Indian Languages|
|PQ||French literature - Italian literature - Spanish literature - Portuguese literature|
|PT||German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature|
|SH||Aquaculture. Fisheries. Angling|
|TC||Hydraulic engineering. Ocean engineering||TCA||Ocean Engineering|
|TD||Environmental technology. Sanitary engineering||TDA||Sanitary Engineering|
|TE||Highway engineering. Roads and pavements|
|TK||Electrical engineering. Electronics. Nuclear engineering||TKA||Electronics|
|TL||Motor vehicles. Aeronautics. Astronautics||TLA||Aeronautics|
|TN||Mining engineering. Metallurgy||TNA||Metallurgy|
|UG||Military engineering. Air forces||UGA||Air Forces|
|VK||Navigation. Merchant marine||VKA||Merchant Marine|
|VM||Naval architecture. Shipbuilding. Marine engineering|
|Z||Books (General). Writing. Paleography. Book industries and trade. Libraries. Bibliography||ZB||Books|
|ZE||Book Industries and Trade|
- These subclasses were later removed after further thought on the matter.
- See Class Z for further information.
Classes E & F: History of the Americas
The Library of Congress Classification system has two classes that cover the history of the Americas. Class E covers the United States while Class F covers the "local history" of the United States in addition to the history of British America, Canada, Dutch America, French America, Latin America and Spanish America.
At the time this classification system was implemented, Wikisource already had Portal:States of the United States with subportals for each state. Therefore, this was used as the equivalent of Class E with little change to the pre-existing portal. Class F was left to cover all other aspects of the history of the Americas, including any aspects of United States history that apply to more than one state.
New classes X & I
During implementation of the system, it was necessary to create two entirely new classes unique to Wikisource. Both use one of the letters omitted from the original classification system.
First, some pre-existing indices on Wikisource did not fit into the Library of Congress Classification system. In order to accommodate these, the new Class X ("Wikisource") was created (X being a traditional wildcard term). This class is generally for Wikisource-specific classification. Subclasses are added to Class X as and when a situation arises where one is needed, starting with subclasses for WikiProjects and specific eras (i.e. Ancient, Medieval, etc.).
Second, there was another pre-existing index, Texts by Country (and its subportals and indices), that did not easily fit any class in the system. These portals were national indices that covered each nation in general instead of the LCCS's more specialised areas (history of-, law of-, literature of- etc). Instead of dismantling or severely modifying a functioning index, this was declared to be a new Class I (I was the first unused letter in the alphabet). Each portal in this class serves as a hub for that nation, including works and/or linking to more specialised portals as necessary.
Class K: Law
Class K of the Library of Congress Classification system already makes extensive use of the third letter of the classification, which makes some adaptation (as described above) more difficult. Subclasses could not always be created by adding a letter; some were created by changing the existing third letter to the nearest unused letter. Others required more drastic alterations, changing the second letter of the classification for a batch of subjects and then selecting appropriate third letters from there.
The complete list of subclasses is extensive and can be found at: Portal:Law/Subclasses
|Original classification||New classification|
|K||Law (general)||KA||Law (general)|
|KD||Law of the United Kingdom and Ireland||KD||Law of the United Kingdom and Ireland|
|KDA||Law of the United Kingdom|
|England||KDB||Law of England|
|Wales||KDD||Law of Wales|
|KDC||Scotland||KDC||Law of Scotland|
|KDZ||America. North America||KDZ||Law of North America|
|Organization of American States (OAS)||KDV||Organization of American States|
|Bermuda||KDW||Law of Bermuda|
|Greenland||KDX||Law of Greenland|
|St. Pierre and Miquelon||KDZ||Law of St. Pierre and Miquelon|
|Added KDU History of Law in North America because other continents had similar classifications.|
|KEN||Newfoundland||KEJ||Law of Newfoundland|
|Northwest Territories||KEK||Law of the Northwest Territories|
|Nova Scotia||KEL||Law of Nova Scotia|
|KF||Law of the United States||KF||Law of the United States|
|Federal law. Common and collective state law||KFA||Federal Law of the United States|
|Individual states||KFB||State Law of the United States|
|KFA - KFW cover individual states, not enough available classifications|
|KFZ||Northwest Territory||KFY||Law of the North West Territory of the United States|
|Confederate States of America||KFZ||Law of the Confederate States of America|
|KGH||Panama||KGH||Law of Panama|
|Panama Canal Zone||KGI||Law of the Panama Canal Zone|
|KJ||Europe||KJ||Law of Europe|
|History of Law||KJB||History of Law in Europe|
|Germanic law||KJD||Germanic Law|
|KJP||Czechoslovakia||KJP||Law of the Czech Republic|
|KJQ||Law of Slovakia|
|KJT||Finland||KJT||Law of Finland|
|France||KJU||Law of France|
|KKK||Luxembourg||KKK||Law of Luxembourg|
|Malta||KKO||Law of Malta|
|KLH||Georgia (Republic)||KLH||Law of Georgia (country)|
|Lithuania, see KKJ||KLJ||Law of Lithuania|
|With KKJ and adjacent classifications in use, Lithuania remains here|
|KLP||Ukraine (1919-1991)||KLP||Law of Ukraine|
|Zakavkazskaia Sotsialisticheskaia Federativnaia Sovetskaia||KLO||Law of the Transcaucasian Socialist Federal Soviet Republic|
|KLR||Kazakhstan||KLR||Law of Kazakhstan|
|Khorezmskaia Sovetskaia Sotsialisticheskaia Respublika (to 1924)||KLU||Law of the Khorezm Socialist Soviet Republic|
|KM||Asia||KM||Law of Asia|
|Middle East. Southwest Asia||KMA||Law of the Middle East|
|KMF||Armenia (Republic)||KMF||Law of Armenia|
|Bahrain||KMB||Law of Bahrain|
|KMG||Gaza||KMG||Law of Palestine|
|KNT-KNU||[India] States, cities, etc.||KNT||State Law of India|
|KNU||Municipal Law of India|
|KPH||States of East and West Malaysia||KPH||Law of the States of East and West Malaysia|
|Maldives||KPI||Law of the Maldives|
|KQ||Africa||KQ||Law of Africa|
|History of law||KQA||History of Law in Africa|
|KQP||British Indian Ocean Territory||KQP||Law of the British Indian Ocean Territory|
|British Somaliland||KQQ||Law of British Somaliland|
|KSE||Equatorial Guinea||KSE||Law of Equatorial Guinea|
|Ifni||KSF||Law of Ifni|
|KSG||Italian East Africa||KSG||Law of Italian East Africa|
|Italian Somaliland||KSI||Law of Italian Somaliland|
|KSV||Mauritius||KSV||Law of Mauritius|
|Mayotte||KSM||Law of Mayotte|
|KTN||Spanish West Africa||KTN||Law of Spanish West Africa|
|Spanish Sahara||KTO||Law of Spanish Sahara|
|KU||Pacific Area||KU||Law of Oceania|
|Australia||KUA||Law of Australia|
|KUA-KUH||States and territories|
|KUB||Law of New South Wales|
|KUC||Law of the Northern Territory|
|KUD||Law of Queensland|
|KUE||Law of South Australia|
|KUF||Law of Tasmania|
|KUG||Law of Victoria|
|KUH||Law of Western Australia|
|KUI||Law of the Ashmore and Cartier Islands|
|KUJ||Law of Christmas Island|
|KUK||Law of the Cocos (Keeling) Islands|
|KUL||Law of the Coral Sea Islands Territory|
|Added KVA History of Law in Oceania because other continents had similar classifications.|
|KVH||American Samoa||KVH||Law of American Samoa|
|British New Guinea (Territory of Papua)||KVJ||Law of British New Guinea|
|KVP||French Polynesia||KVP||Law of French Polynesia|
|German New Guinea||KVO||Law of German New Guinea|
|KVS||Marshall Islands||KVS||Law of the Marshall Islands|
|Micronesia (Federated States)||KST||Law of Micronesia|
|Midway Islands||KSV||Law of the Midway Islands|
|KVU||Nauru||KVU||Law of Nauru|
|Netherlands New Guinea||KVX||Law of Netherlands New Guinea|
|KWL||Pitcairn Island||KWL||Law of Pitcairn Island|
|Solomon Islands||KWM||Law of the Solomon Islands|
|KWT||Wake Island||KWT||Law of Wake Island|
|Wallis and Futuna Islands||KWU||Law of Wallis and Futuna|
Some sections from the Law of the Caribbean in subclass KG were moved to the vacant subclass KC due to space limitations.
|Original classification||New classification|
|KGJ||Anguilla||KCA||Law of Anguilla|
|KGK||Aruba||KCB||Law of Aruba|
|KGL||Barbados||KCC||Law of Barbados|
|Bonaire||KCD||Law of Bonaire|
|British Leeward Islands||KCE||Law of the British Leeward Islands|
|British Virgin Islands||KCF||Law of the British Virgin Islands|
|British West Indies||KCG||Law of the British West Indies|
|British Windward Islands||KCH||Law of the British Windward Islands|
|KGP||Dominica||KCJ||Law of Dominica|
|KGR||Netherlands Antilles||KCK||Law of the Netherlands Antilles|
|Dutch Windward Islands||KCL||Law of the Dutch Windward Islands|
|French West Indies||KCM||Law of the French West Indies|
|Grenada||KCN||Law of Grenada|
|Guadeloupe||KCP||Law of Guadeloupe|
|KGT||Martinique||KCQ||Law of Martinique|
|Montserrat||KCR||Law of Montserrat|
|KGW||Saint Christopher (Saint Kitts), Nevis, and Anguilla||KCS||Law of Saint Kitts and Nevis|
|Saint Lucia||KCT||Law of Saint Lucia|
|Saint Vincent and the Grenadines||KCU||Law of Saint Vincent and the Grenadines|
|Sint Eustatius||KCV||Law of Sint Eustatius|
|Sint Maarten||KCW||Law of Sint Maarten|
Update: In the official LCCS, Class Z is divided into just two subclasses, subclass Z and subclass ZA. Subclass Z covers several different areas: Books (General). Writing. Paleography. Book industries and trade. Libraries. Bibliography. This needs to be split to be used on Wikisource. The first version of this split attempted to preserve the order as seen in the LCCS. The official subclass ZA prevented the second letter of the call number from being used, so this was left blank and the third letter was used. For example, "Writing" was split to subclass Z_A. This was unwieldy and awkward, so the second version drops the attempt to preserve the order and moves all of the new subclasses to succeed subclass ZA. For example, "Writing" becomes subclass ZC. The following table shows both versions of this scheme:
|Subject area||1st version codes||2nd version codes|
|Book Industries and Trade||Z*C||ZE|
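The second-version recoding can be sketched as a simple sequential assignment that skips the letter already taken by the official subclass ZA. The function name `second_version_codes` is hypothetical. Only the codes for Books (ZB), Writing (ZC) and Book Industries and Trade (ZE) are stated on this page; the codes the sketch produces for the remaining subjects are inferred from the pattern and should be treated as assumptions.

```python
from string import ascii_uppercase

# Subject areas of official subclass Z, in LCCS order.
SUBJECTS = [
    "Books",
    "Writing",
    "Paleography",
    "Book Industries and Trade",
    "Libraries",
    "Bibliography",
]


def second_version_codes(subjects, class_letter="Z", reserved=("A",)):
    """Assign two-letter codes in order, skipping reserved second letters.

    'A' is reserved because subclass ZA already exists in the LCCS.
    """
    letters = [ch for ch in ascii_uppercase if ch not in reserved]
    if len(subjects) > len(letters):
        raise ValueError("not enough letters for all subjects")
    return {subject: class_letter + ch for subject, ch in zip(subjects, letters)}


codes = second_version_codes(SUBJECTS)
for subject, code in codes.items():
    print(f"{code}: {subject}")
```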
This section is an appendix to the essay. It may be helpful in understanding the adaptation and the classification system if all alterations to it are clearly logged.
- 18:16, 10 November 2010: Subclass GE changed from "Environmental Sciences" to "Environment" (to match existing portal)
- 19:25, 11 November 2010: Subclass HTB changed from "Races" to "Race studies" (to match existing portal)
- 23:27, 12 November 2010: Subclass BL changed from "Religions" to "Religion" (to match existing portal)
- 22:51, 29 November 2010: Subclass BPA changed from "Bahaism" to "Bahá'í Faith" (to match existing portal)
- 13:54, 18 January 2011: Subclass KNP changed from "Law of Taiwan" to "Law of the Republic of China" (following page move)
- 12:47, 15 March 2011: Subclass B* changed from "Philosophy (general)" to "General Philosophy" (improving the readability and clarity of the portal title)
- 12:51, 15 March 2011: Subclass D* changed from "History (general)" to "General History" (improving the readability and clarity of the portal title)
- 12:53, 15 March 2011: Subclass G* changed from "Geography (general)" to "Geography" (no disambiguation necessary in this case)
- 12:55, 15 March 2011: Subclass H* changed from "Social Sciences (general)" to "General Social Sciences" (improving the readability and clarity of the portal title)
- 12:57, 15 March 2011: Subclass JA changed from "Political Science (general)" to "General Political Science" (improving the readability and clarity of the portal title)
- 13:30, 15 March 2011: Subclass PN changed from "Literature (general)" to "General Literature" (improving the readability and clarity of the portal title)
- 17:28, 22 March 2011: Subclass KA changed from "Law (general)" to "General Law" (improving the readability and clarity of the portal title)
- 17:34, 22 March 2011: Subclass L* changed from "Education (general)" to "General Education" (improving the readability and clarity of the portal title)
- 17:37, 22 March 2011: Subclass M* changed from "Music (general)" to "General Music" (improving the readability and clarity of the portal title)
- 17:39, 22 March 2011: Subclass Q* changed from "Science (general)" to "General Science" (improving the readability and clarity of the portal title)
- 17:41, 22 March 2011: Subclass R* changed from "Medicine (general)" to "General Medicine" (improving the readability and clarity of the portal title)
- 17:42, 22 March 2011: Subclass S* changed from "Agriculture (general)" to "General Agriculture" (improving the readability and clarity of the portal title)
- 17:44, 22 March 2011: Subclass T* changed from "Technology (general)" to "General Technology" (improving the readability and clarity of the portal title)
- 17:46, 22 March 2011: Subclass U* changed from "Military Science (general)" to "General Military Science" (improving the readability and clarity of the portal title)
- 17:48, 22 March 2011: Subclass V* changed from "Naval Science (general)" to "General Naval Science" (improving the readability and clarity of the portal title)
- 18:29, 24 April 2011: Added subclasses to Class I (to add flexibility to the classification)
- 21:40, 6 February 2013: Removed subclass TEA ("Roads and pavements"), merged content back into subclass TE ("Highway engineering") as per the original LCCS. The difference in content between the two potential portals was slim and confusing; having both separate portals was redundant.
- 02:01, 9 February 2013: Removed subclass JVB ("International migration"), merged into subclass JVA ("Emigration and immigration"). Same reason as above.
- 20:32, 16 February 2013: Removed subclass UEA ("Armor"). Error in original interpretation; this is not distinct enough from UE ("Cavalry").
- 10:20, 18 February 2013: Merged GF (Human ecology) and GFA (Anthropogeography) into GF (Human geography)
- 17:02, 18 February 2013: Collapsing all of the PTx subclasses into one master "Germanic literature" subclass. Too many minor subclasses overloading Class P; on a later look at the list of languages, they were all in the Germanic family making this an obvious choice for merging them all into one simple subclass.
- 17:06, 18 February 2013: Collapsing all of the PQx subclasses into one master "Romanic literature" subclass. Following the example of the previous change.
- 16:51, 20 February 2013: Recoded Class Z. See Class Z above.
- 23:14, 7 August 2013: Collapsed subclasses VM (Shipbuilding), VMA (Naval architecture) and VMB (Marine engineering) back into one master subclass for VM: Shipbuilding and naval architecture
- 22:24, 20 August 2013: Removed subclass DKA ("History of the Soviet Union"), merged into subclass DK ("History of Russia"). As above, the two portals largely cover the same subject.
- 22:48, 20 August 2013: Removed subclass DLA ("History of Scandinavia"), merged into subclass DL ("History of Northern Europe") but kept the name "History of Scandinavia". The other countries of Northern Europe are covered by other subclasses, leaving only Scandinavia anyway; no need for two subclasses.