
Project files, scripts, configurations, and workflow publications for the Congressional Portal Project


prys0000/congressional-portal-project


congressional-portal-project

The congressional-portal-project repository holds workflows, methodologies, instructional materials, controlled vocabularies, and related resources. It was created to house large-scale project efficiency methodologies and automated workflows, and to document strategies throughout the project timeline. The project focuses on materials relating to the American Congress from the Carl Albert Research and Studies Center Archives. Specific content information can be reviewed below.

january 2024 updates

The new American Congress Digital Archives Portal link has been established. The Center has submitted 2,139 individual items, comprising 16,540 pages of text, including both typed and handwritten content that can be searched, and 100 video files. These materials are currently under review by the project manager.

This first submission for review includes subjects related to American Indian Sovereignty and Oklahoma Congressional Representation.

Collection Dates
Carl Albert Collection 1947-1971
Dewey F. Bartlett Collection 1973-1978
John Happy Camp Collection 1969-1973
Fred R. Harris Collection 1961-1976
Robert L. Owen Collection 1913-1947
Elmer Thomas Collection 1923-1950
James R. Jones Collection 1973-1986
Julian P. Kanter Collection 1984-1988

december 2023 updates

We are pleased to announce the successful rollout of our latest application and model, which has significantly enhanced our capabilities in text interpretation, metadata collection, sentiment analysis, tone recognition, and other forms of data extraction from archival text. We are currently testing it on photograph collections.

Previous methodology: Students working 10-15 hours per week were able to scan and collect metadata for an average of 55 documents per week.

Document Processing: Our new model has processed and collected data from a remarkable 12,537 pages of archival materials, comprising typewritten and handwritten content.

Error Resolution: We identified and addressed 150 errors related to misspelled months, such as "Vay" instead of "May" and "Mary" instead of "March." We have promptly resolved this issue, ensuring accurate data handling moving forward.
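The project does not describe how the month errors were corrected, so the following is only a minimal sketch of one possible approach. It checks a small table of known OCR misreadings first (seeded with the two examples above) and falls back to fuzzy matching against the twelve month names. The names `KNOWN_FIXES` and `correct_month` and the 0.8 cutoff are illustrative assumptions, not part of the project's code.

```python
import difflib

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Known OCR misreadings, taken from the two examples reported above.
# (Hypothetical table; extend it as new errors are found in review.)
KNOWN_FIXES = {"vay": "May", "mary": "March"}


def correct_month(token, cutoff=0.8):
    """Return a corrected month name, or None if no confident match exists."""
    key = token.strip().lower()
    # 1. Exact month names pass through unchanged.
    for month in MONTHS:
        if key == month.lower():
            return month
    # 2. Known misreadings take precedence over similarity scoring.
    if key in KNOWN_FIXES:
        return KNOWN_FIXES[key]
    # 3. Fall back to fuzzy matching on the lowercased month names.
    lowered = [m.lower() for m in MONTHS]
    matches = difflib.get_close_matches(key, lowered, n=1, cutoff=cutoff)
    if matches:
        return MONTHS[lowered.index(matches[0])]
    return None


print(correct_month("Vay"))
print(correct_month("Mary"))
print(correct_month("Janury"))
```

The lookup table comes first because character similarity alone can choose the wrong month: "Mary" is closer to "May" than to "March" by `difflib`'s ratio, so a purely fuzzy approach would miscorrect it.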

folders:

  • documentation-applications-list contains project worksheets, collection indexes, training models, and controlled vocabularies.
  • workflows contains packaged workflows with either executable portable applications or consolidated/compiled scripts for OCRing, assigning controlled metadata, extracting specific text from OCR text.
  • depreciated-packages contains outdated scripts and notes that have been replaced by newer versions.
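As a rough illustration of "extracting specific text from OCR text", the sketch below pulls dates written in the form "Month D, YYYY" out of a block of OCR output using a regular expression. It is not taken from the packaged workflows; the function name `extract_dates`, the date format, and the sample letter text are all assumptions made for the example.

```python
import re

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

# Matches dates such as "May 12, 1967" (month name, day, comma, 4-digit year).
DATE_PATTERN = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b"
)


def extract_dates(ocr_text):
    """Return (month, day, year) tuples for each date found in the text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in DATE_PATTERN.finditer(ocr_text)
    ]


# Invented sample text, for illustration only.
sample = "Dear Senator: Your letter of March 3, 1965 arrived. We met on May 12, 1967."
print(extract_dates(sample))
```

Dates in other layouts (for example "3 March 1965" or "3/3/65") would need additional patterns.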

tasks simplified

  • handwritten documents - the focus is on transcribing hard-to-read text and collecting required metadata quickly and efficiently
  • combination of handwritten and typewritten documents - developing automated processes to read both handwritten and typewritten documents
  • face recognition - bulk processing recognition of faces in images with trained models, scripts, and lists
  • transcribing A/V - bulk transcription and extraction of audio/visual text greatly enhances workflow
  • transcribing A/V - topic modeling - bulk transcription of audio/visual materials combined with advanced topic identification increases productivity, accuracy, and efficiency
  • migration and finding aid transforms - bulk processing for finding aid transforms involves creating or updating descriptive metadata and finding aids for archival collections
  • quality control and error checking - reviews and verifies the accuracy, completeness, and consistency of digitized and transcribed materials resulting from the above tasks
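One way the quality control step could be automated is sketched below: each metadata record is checked for required fields and for a year that falls within its collection's date range. The date ranges are the ones listed under "Collection Dates" in this README; the record layout (`title`, `collection`, `year`) and the function name `check_record` are assumptions for illustration and do not reflect the project's actual schema.

```python
# Date ranges as listed in the "Collection Dates" table of this README.
COLLECTION_DATES = {
    "Carl Albert Collection": (1947, 1971),
    "Dewey F. Bartlett Collection": (1973, 1978),
    "John Happy Camp Collection": (1969, 1973),
    "Fred R. Harris Collection": (1961, 1976),
    "Robert L. Owen Collection": (1913, 1947),
    "Elmer Thomas Collection": (1923, 1950),
    "James R. Jones Collection": (1973, 1986),
    "Julian P. Kanter Collection": (1984, 1988),
}

# Hypothetical required fields for a metadata record.
REQUIRED_FIELDS = ("title", "collection", "year")


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    # Consistency: the year must fall inside the collection's date range.
    span = COLLECTION_DATES.get(record.get("collection"))
    year = record.get("year")
    if span and isinstance(year, int) and not span[0] <= year <= span[1]:
        problems.append(f"year {year} outside {span[0]}-{span[1]}")
    return problems


record = {"title": "Letter", "collection": "Fred R. Harris Collection", "year": 1980}
print(check_record(record))
```

Records that return a non-empty list can be routed to a reviewer instead of being published directly.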

content overview

The Center will concentrate on content related to four curated collections, encompassing over 75,677 individual items from the CAC Archives. Additional digital files are available on our Digital Archives Platform.

| Collection | Topical/Whole | Topics | Subtopics | Significance | Extent | Formats |
| --- | --- | --- | --- | --- | --- | --- |
| Indian Self-Determination | topical | Congress as policy-maker, Leaders and parties, Congress and the courts | Types of decisions, Committee leadership, Policy making in committee, Constituent communications, Demography | Congressional offices hold correspondence showcasing intricate strategies used by tribal entities and congressional members. Collections highlight policy actions and issues affecting tribes across various states. Because much of the relevant legislation has a national purview, our project isn't solely focused on Oklahoma. | 23 collections | PDF/A, PDF/E or PDF with original file, TIFF |
| Robert L. Owen Collection | collection-whole | Congress as policy-maker, Leaders and parties, Congress and the courts | Cultural norms | Robert L. Owen was a member of the Cherokee Nation and represented the Five Civilized Tribes as a federal Indian agent before entering politics as a Progressive Democrat. Owen is one of only four Native Americans to have served in the United States Senate. | 199 items | PDF/A, PDF/E or PDF with original file, TIFF |
| United States House of Representatives Offices Campaign Ads | collection-whole | Leaders and parties, Elections, Congress and interest groups, Congress history - general | Leadership activities, Determinants of voting, Tactics, Electoral outcomes, Impact of technology | Through the collection of television and radio political advertisements, film, social media, and other sources, the archive seeks to expand the knowledge and understanding of political communications, and the growth and changes in this field across the most significant and prolific era in world history. | 24,678 items | Motion JPEG 2000, MOV, AVI |
| Carl Albert Photograph Collection | collection-whole | Leaders and parties | Party leadership files | Exclusive to the Carl Albert Center Archives is Albert's vast personal photograph collection, spanning the entirety of his career. | 11,000 items | TIFF |

acknowledgements

Carl Albert Congressional Research and Studies Center Archives

See acknowledgements for student staff and collaborators

See collaborative partners for project partners.

authors

JA Pryse - Senior Archivist, III

license

See LICENSE for more information.

