Bulk Data Task Force Reports Major Strides at October 2019 Meeting

The Bulk Data Task Force (BDTF) is essentially the Justice League of legislative data.

The task force convenes each quarter, bringing together the people in charge of managing Legislative Branch data—like the House Clerk, Secretary of the Senate, GPO, and Library of Congress—as well as outside stakeholders. Together the group works to make legislative data freely accessible to all.

The task force convened last week at the Legislative Data and Transparency Conference.

[youtube https://www.youtube.com/watch?v=hTZ0MPGPY74?start=3563]

Here are the highlights:

Office of the Clerk of the U.S. House of Representatives

The “Posey Print” Project—A.K.A. track changes for legislation—is well under way. 

[Image: Bill Compare screen grab. Source: YouTube]

Whether you’re a Member of Congress or a staffer who drafts legislation and amendments, you likely want to see how your work will impact the laws that are already in place. 

This seems like a simple enough request, but figuring out the impact of new legislative text is no easy feat. This is primarily because, until recently, there was no “track changes” tool available for legislation. Creating a comparison between an amendment and base text was a totally manual process that required significant legislative and legal expertise.

Lawmakers moved to address this problem in the House Rules for the 115th Congress (Rule XXI Clause 12), which direct the House Clerk to build a legislation comparison tool. Congress has been building out this tool in phases, tackling three types of comparisons:

  1. Document to Document — i.e., a bill comparison
  2. How an Amendment Changes Current Law
  3. How an Amendment Would Change a Legislative Proposal

The Document-to-Document comparison is essentially a side-by-side view of two documents that shows what’s the same and what’s different. This tool was completed to meet a December 2017 deadline and can be found at the Congress-only website BillCompare.House.Gov.
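To make the idea of a document-to-document comparison concrete, here is a minimal sketch using Python's standard difflib module. It is an illustration only, not the Clerk's tool: the function name and the sample text are made up, and real legislative comparison has to handle structure, numbering, and citations that a plain line diff ignores.

```python
import difflib
from typing import List


def compare_documents(old_text: str, new_text: str) -> List[str]:
    """Return a line-by-line comparison of two documents.

    Lines prefixed with two spaces are unchanged, "- " lines appear only
    in the old document, and "+ " lines appear only in the new one.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    result = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend("  " + line for line in old_lines[i1:i2])
        else:
            # Covers "replace", "delete", and "insert" opcodes.
            result.extend("- " + line for line in old_lines[i1:i2])
            result.extend("+ " + line for line in new_lines[j1:j2])
    return result


# Made-up sample text, not real legislative language.
introduced = (
    "Section 1. Short title.\n"
    "Section 2. The fee is $10.\n"
    "Section 3. Effective date."
)
amended = (
    "Section 1. Short title.\n"
    "Section 2. The fee is $25.\n"
    "Section 3. Effective date."
)

diff_lines = compare_documents(introduced, amended)
print("\n".join(diff_lines))
```

Even this toy version shows why the other two comparison types are harder: an amendment usually describes a change ("strike X and insert Y") rather than supplying a second full document to diff against.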

An early version of the tool to track changes to current law has already been built for internal users to meet an early deadline, and requires legal or highly skilled staff to operate it.

The House Clerk’s Office is currently creating a more robust tool that shows how an amendment changes a legislative proposal and how a bill changes current law, and that also offers bill-to-bill comparisons. The Clerk’s Office generously demonstrated the tool for the LDTC audience and we’re very excited about it. You should watch it here.

The House Clerk’s New Website 

The Clerk’s new website, ClerkPreview.House.Gov, is available in beta. The site consumes an API from XML sources to share information on floor activity, committee schedules, disclosure info, and more. The Clerk hopes to make the API publicly available in the future.

Member Bioguides

The contents of Bioguide.Congress.gov, the site that publishes biographical information on all Members of Congress (past and present), will be available as data by the end of 2019. (Presumably the site will also migrate from HTTP to HTTPS.) The Bioguide website is the source of unique IDs for every Member of Congress.

Witness Disclosure Forms

Witnesses testifying before Congress are required to submit a disclosure form to the committee that indicates certain potential conflicts of interest. In recent years, those forms have been filled out by hand and submitted as scanned documents; it’s not uncommon for fields to be skipped altogether. Now, a standardized witness disclosure form for committee use is available on Docs.House.Gov. We hope the House will take one more step and move to a webform, collecting all the submitted information into a central database that’s searchable by each field in the form.

Automated Co-Sponsorship of Legislation

If a Member wants to find co-sponsors for a bill, someone in their office, usually an intern or junior staffer, has to go door to door collecting manual signatures from interested Members’ offices. On top of being time-consuming and inefficient, the process poses potential authentication problems, as names can be misread or anyone (in theory) could sign the letter. To address this problem, the FY 2020 Leg. Branch Appropriations Bill Report encourages the House Clerk to develop an automated sign-on tool.

Government Publishing Office (GPO)

GPO Shared A “Release Roadmap” 

GPO plans to release resources like the Complete US Code (by 2019), Congressional Bills, Public Laws, Statutes at Large for the 117th Congress, and House and Senate Calendars for 2020 via XPub. If you’re not familiar, XPub is the update to Microcomp (a tool for writing, printing, and publishing documents) that GPO has historically used.

Historic Statutes At Large In USLM Format

GPO is piloting a project to see how difficult and expensive it would be to convert pre-2003 Statutes at Large into USLM. The Statutes at Large are all the laws enacted by Congress; if they are available as structured data, it becomes possible to use technology to instantaneously show how every bill has amended the law.

GPO will first convert a test group of digitized Statutes at Large into USLM XML. The office says this will be a long-term project that goes beyond FY 2020. The assessment was originally requested in the FY 2019 Leg. Branch Appropriations Bill Report.

Legislative Information As Data

GPO has made an API for bill status available to the public. Additionally, the Electronic Code of Federal Regulations (eCFR), an up-to-date version of the CFR, is available as bulk data. Learn more at https://www.govinfo.gov/features/api

Office of the Secretary of the Senate

Senate Directory As Data

One of the most frequent requests of the Senate is data for mailing labels. To meet this need, the Senate is publishing a contact list on Senate.gov. The Library has asked the Secretary to extend that information and republish it in JSON format. That project is underway and will be available both across the Capitol Hill extranet and to the public. The Secretary’s office is also considering sharing committee member information, committee schedules, and roll call votes in JSON.
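As a rough illustration of why a JSON contact list helps with the mailing-label request, the sketch below turns a small JSON array into label text. The field names ("name", "office", "city", "state", "zip"), the sample entries, and the format_label helper are all hypothetical; the Senate's actual schema was not described at the meeting.

```python
import json

# Hypothetical sample data. These field names are assumptions made for
# illustration and are not the Senate's actual JSON schema.
SAMPLE_JSON = """
[
  {"name": "Jane Doe", "office": "123 Example Senate Office Building",
   "city": "Washington", "state": "DC", "zip": "20510"},
  {"name": "John Roe", "office": "456 Example Senate Office Building",
   "city": "Washington", "state": "DC", "zip": "20510"}
]
"""


def format_label(entry: dict) -> str:
    """Build a three-line mailing label from one contact record."""
    return "\n".join([
        f"The Honorable {entry['name']}",
        entry["office"],
        f"{entry['city']}, {entry['state']} {entry['zip']}",
    ])


labels = [format_label(entry) for entry in json.loads(SAMPLE_JSON)]
print("\n\n".join(labels))
```

With structured data, the same few lines would work for every office on the list, with no copying and pasting from a web page.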

Library of Congress (LOC)

New & Improved Committee Calendar 

The committee calendar schedule, which can be found on Congress.gov, pulls from the House Committee Repository as well as hearings and meetings on Senate.gov to aggregate all House and Senate committee meetings and hearings for a given week in one place. Here are the newest features:

  • Email notifications when your favorite committees have added events to the Congress.gov calendar. This is possible because LOC added meeting event data to the search function. All you need to do is pick the committees you’re interested in and set an alert.
  • Links to relevant legislation on event and meeting pages.
  • Search filters already available for the wider Congress.gov platform are now available for sorting the committee calendar.
  • An expanded calendar view that allows for viewing an entire week’s events at once. 

CRS Report Publication

We asked whether CRS will continue to post the backlog of public reports online, and CRS indicated in a written response that it believes it has fulfilled the statutory requirement. For reference, as of October 18th there were 7,301 reports on crsreports.congress.gov and 15,145 on EveryCRSReport.com. CRS did not answer whether it will publish any of its historical reports currently unavailable on CRS’s internal website but available for a fee from third-party data providers. The office also does not have plans to publish the reports as data despite publishing the reports as HTML internally for Congress. Publication as HTML would make the information significantly more useful.

Public Law as USLM

The Library also shared that it plans to add public law text in USLM by the end of 2019.

Related Reading & Additional Resources

Bulk Data Repository: GovInfo.gov/bulkdata
Clipping: Live.House.Gov
Congress: Congress.gov
Digital Design: DesignSystem.Digital.Gov
Gov Info: GovInfo.Gov
Gov Info API: Api.GovInfo.Gov
GPO: Github.com/USGPO
GPO Innovation Hub: Github.com/usgpo/innovation
House of Representatives Repository: Docs.House.Gov
House of Representatives Legacy Site (1998): XML.House.Gov
Member Bioguides: Bioguide.Congress.gov
Office of the House Clerk: Clerk.House.Gov & ClerkPreview.House.Gov
Senate: Senate.gov
USLM Statutes: Github.com/USGPO/USLM & GovInfo.gov/app/collection/comps
Tech Timeline: xml.house.gov/resources/TechTimeline.htm

— Written by Amelia Strauss