The congressional-portal-project provides a repository for workflows, methodologies, instructional materials, controlled vocabularies, and more. This repository was created to house large-scale project efficiency methodologies and automated workflows, and to document strategies throughout the project timeline. The project scope focuses on materials relating to the American Congress from the Carl Albert Congressional Research and Studies Center Archives. Specific content information can be reviewed below.
The new American Congress Digital Archives Portal link has been established. The Center has submitted 2,139 individual items, comprising 16,540 pages of text, including both typed and handwritten content that can be searched, and 100 video files. These materials are currently under review by the project manager.
This first submission for review includes subjects related to American Indian Sovereignty and Oklahoma Congressional Representation.
Collection | Dates |
---|---|
Carl Albert Collection | 1947-1971 |
Dewey F. Bartlett Collection | 1973-1978 |
John Happy Camp Collection | 1969-1973 |
Fred R. Harris Collection | 1961-1976 |
Robert L. Owen Collection | 1913-1947 |
Elmer Thomas Collection | 1923-1950 |
James R. Jones Collection | 1973-1986 |
Julian P. Kanter Collection | 1984-1988 |
We are pleased to announce the successful rollout of our latest application and model, which has significantly enhanced our capabilities in text interpretation, metadata collection, sentiment analysis, tone recognition, and other forms of data extraction from archival text. We are currently testing it on photograph collections.
Previous methodology: Students working 10-15 hours per week were able to scan and collect metadata for an average of 55 documents per week.
Document Processing: Our new model has processed and collected data from a remarkable 12,537 pages of archival materials, comprising typewritten and handwritten content.
Error Resolution: We identified and addressed 150 errors related to misspelled months, such as "Vay" instead of "May" and "Mary" instead of "March." We have promptly resolved this issue, ensuring accurate data handling moving forward.
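The kind of correction described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: the correction table contains only the two misreadings named above, and the function name `fix_months` and the date pattern (month, day, four-digit year) are assumptions made for the example. The lookahead restricts replacements to date contexts so that a personal name such as "Mary" is left untouched.

```python
import re

# Assumed correction table: only the two misreadings reported above.
MONTH_FIXES = {"Vay": "May", "Mary": "March"}

# Match a known misreading only when a day and a four-digit year follow it,
# so that "Mary" used as a name elsewhere in the text is not changed.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, MONTH_FIXES)) + r")\b(?=\s+\d{1,2},?\s+\d{4})"
)

def fix_months(text):
    """Return (corrected_text, number_of_replacements)."""
    return _PATTERN.subn(lambda m: MONTH_FIXES[m.group(1)], text)

if __name__ == "__main__":
    sample = "Letter dated Vay 3, 1952 from Mary Smith; reply Mary 12, 1953."
    fixed, count = fix_months(sample)
    print(fixed)
    print(count)
```

An explicit table is used here rather than fuzzy matching because "Mary" is closer in spelling to "May" than to "March", so a similarity-based approach could pick the wrong month.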
- documentation-applications-list contains project worksheets, collection indexes, training models, and controlled vocabularies.
- workflows contains packaged workflows with either executable portable applications or consolidated/compiled scripts for OCRing, assigning controlled metadata, and extracting specific text from OCR output.
- depreciated-packages contains outdated scripts and notes that have been replaced by newer versions.
- handwritten documents - the focus is on transcribing hard-to-read text and collecting required metadata quickly and efficiently
- combination of handwritten and typewritten documents - developing automated processes to read both handwritten and typewritten documents
- face recognition - bulk processing recognition of faces in images with trained models, scripts, and lists
- transcribing A/V - bulk transcription and extraction of audio/visual text greatly enhances workflow
- transcribing A/V - topic modeling - bulk transcription of audio/visual materials combined with advanced topic identification increases productivity, accuracy, and efficiency
- migration and finding aid transforms - bulk processing for finding aid transforms involves creating or updating descriptive metadata and finding aids for archival collections
- quality control and error checking - reviews and verifies the accuracy, completeness, and consistency of digitized and transcribed materials resulting from the above tasks
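As a rough illustration of the quality control step, the sketch below checks a single metadata record for completeness and date consistency. It is an assumption-laden example rather than the project's workflow: the field names in `REQUIRED_FIELDS`, the accepted date format, and the function name `check_record` are all hypothetical.

```python
import re

# Hypothetical required fields; the project's actual schema may differ.
REQUIRED_FIELDS = ("title", "date", "collection")

# Assumed date convention for the example: YYYY, YYYY-MM, or YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")

def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Completeness: every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing {field}")
    # Consistency: a date, if present, must follow the assumed format.
    date = record.get("date")
    if date is not None and str(date).strip():
        if not ISO_DATE.match(str(date).strip()):
            problems.append(f"unrecognized date format: {date}")
    return problems

if __name__ == "__main__":
    print(check_record({"title": "", "date": "Vay 3, 1952"}))
```

Returning a list of problems, rather than raising on the first one, lets a reviewer see every issue in a record in a single pass.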
The Center will concentrate on content related to four curated collections, encompassing over 75,677 individual items from the CAC Archives. Additional digital files are available on our Digital Archives Platform.
Collection | Topical/Whole | Topics | Subtopics | Significance | Extent | Formats |
---|---|---|---|---|---|---|
Indian Self-Determination | topical | Congress as policy-maker, Leaders and parties, Congress and the courts | Types of decisions, Committee leadership, Policy making in committee, Constituent communications, Demography | Congressional offices hold correspondence showcasing intricate strategies used by tribal entities and congressional members. Collections highlight policy actions and issues affecting tribes across various states. Because much of the relevant legislation has a national purview, our project is not solely focused on Oklahoma. | 23 collections | PDF/A, PDF/E or PDF with original file, TIFF |
Robert L. Owen Collection | collection-whole | Congress as policy-maker, Leaders and parties, Congress and the courts | Cultural norms | Robert L. Owen was a member of the Cherokee Nation and represented the Five Civilized Tribes as a federal Indian agent before entering politics as a Progressive Democrat. Owen is one of only four Native Americans to have served in the United States Senate. | 199 items | PDF/A, PDF/E or PDF with original file, TIFF |
United States House of Representatives Offices Campaign Ads | collection-whole | Leaders and parties, Elections, Congress and interest groups, Congress history - general | Leadership activities, Determinants of voting, Tactics, Electoral outcomes, Impact of technology | Through the collection of television and radio political advertisements, film, social media, and other sources, the archive seeks to expand the knowledge and understanding of political communications, and the growth and changes in this field across the most significant and prolific era in world history. | 24,678 items | Motion JPEG 2000, MOV, AVI |
Carl Albert Photograph Collection | collection-whole | Leaders and parties | Party leadership files | Exclusive to the Carl Albert Center Archives is Albert’s vast personal photograph collection, spanning the entirety of his career. | 11,000 items | TIFF |
Carl Albert Congressional Research and Studies Center Archives
See acknowledgements for student staff and collaborators
See collaborative partners for project partners.
JA Pryse - Senior Archivist, III
See LICENSE for more information.