Building NextGen: What’s Happening Behind the Scenes

JRI-Poland has been hard at work behind the scenes developing its new NextGen database. While we assemble it, we can explain a little about work that is not quite ready for “prime time” but is getting close. Each of our more than 2,000 files is being individually reviewed and standardized by a small team of highly motivated volunteers. We are involved in a three-pronged process:

  1. Consolidate our Spreadsheets and Assess the Quality & Completeness of Data
  2. Continue to Index & Extract New Datasets and Missing Registers from Existing Data
  3. Clean Up the Data, Improve It and Optimize the Format

We are Consolidating Spreadsheets, Assessing Data Quality & Completeness

We have put together a team of dedicated volunteers who are studiously preparing the existing data, both in the Legacy system and offline, for import into the NextGen system. In effect, this handful of experienced volunteers serves as data inspectors, determining the quality and completeness of each collection.

We are compiling a huge “data warehouse” of Jewish given names to help assign a male or female designation to our more than 6 million records, including millions of records where biological sex was previously unassigned or incorrectly recorded. This will allow much more accurate searching. For example, in our Legacy system, a search for people named “Chaim” will generally also return people named “Chana”, who are clearly unlikely to be of the same sex. We are working to assign sex to as many records as possible, and our new Given Name data warehouse is leading the way.
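To illustrate the idea, here is a minimal sketch of how a given-name lookup could narrow search results by sex. It is not a description of the NextGen implementation: the table `GIVEN_NAME_SEX`, its four sample entries, and the functions `normalize`, `assign_sex`, and `filter_candidates` are all hypothetical names chosen for this example.

```python
import unicodedata

# Hypothetical sample entries for illustration only; the real given-name
# data warehouse is far larger and its structure is not described here.
GIVEN_NAME_SEX = {
    "chaim": "M",
    "chana": "F",
    "mordka": "M",
    "ryfka": "F",
}


def normalize(name: str) -> str:
    """Trim, lowercase, and strip combining accents so variants share a key."""
    decomposed = unicodedata.normalize("NFKD", name.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def assign_sex(given_name: str) -> str:
    """Return "M", "F", or "U" (unassigned) for a given name."""
    return GIVEN_NAME_SEX.get(normalize(given_name), "U")


def filter_candidates(query_name: str, candidate_names: list[str]) -> list[str]:
    """Drop candidates whose known sex conflicts with the query name's sex.

    Candidates with an unknown sex are kept, so nothing is hidden merely
    because a name is missing from the lookup table.
    """
    wanted = assign_sex(query_name)
    if wanted == "U":
        return list(candidate_names)
    return [n for n in candidate_names if assign_sex(n) in (wanted, "U")]


matches = filter_candidates("Chaim", ["Chaim", "Chana", "Chaym"])
print(matches)
```

In this sketch, a search for “Chaim” keeps “Chaim” and the unlisted spelling “Chaym” but drops “Chana”. Keeping names of unknown sex is a deliberate choice here, so that an incomplete table narrows results without hiding records.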

“Searchable Roles”

In our legacy system, search results were primarily limited to the main subjects of a record and their parents. Our new NextGen system expands this visibility significantly. Every person mentioned in a document is now a searchable individual, providing a much richer picture of your ancestors’ social circles and migration patterns.

Now you can search for and find:

  • Direct Family: The primary subject, their parents, spouses, and surviving children.
  • Extended Family: Grandparents and other relatives mentioned in the text.
  • Community Members: Witnesses, rabbis, mohels, midwives, or even boarders residing in the same household.

This means a single search could reveal a relative’s birth record in one town and their appearance as a witness in a death record 30 years later in a different location.
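As a rough sketch of what “searchable roles” could look like as data, the example below indexes every person mentioned in a record together with the role they play in it. The `Mention` class, its field names, the `build_index` and `find_person` functions, and the sample names, towns, and years are all invented for illustration and do not reflect the actual NextGen schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One appearance of a person in a record (hypothetical structure)."""
    record_id: str
    record_type: str  # e.g. "birth", "death"
    town: str
    year: int
    role: str         # e.g. "subject", "father", "witness"
    name: str


def build_index(mentions):
    """Index every mention by lowercased name, whatever the person's role."""
    index = defaultdict(list)
    for m in mentions:
        index[m.name.lower()].append(m)
    return index


def find_person(index, name):
    """Return all mentions of a name, ordered by year."""
    return sorted(index.get(name.lower(), []), key=lambda m: m.year)


# Invented sample data: the same name appears as a birth subject in one
# town and as a witness to a death record in another town 30 years later.
mentions = [
    Mention("D-17", "death", "Town B", 1880, "witness", "Moszek Example"),
    Mention("B-42", "birth", "Town A", 1850, "subject", "Moszek Example"),
    Mention("B-42", "birth", "Town A", 1850, "father", "Abram Example"),
]

index = build_index(mentions)
hits = find_person(index, "Moszek Example")
for hit in hits:
    print(hit.year, hit.town, hit.record_type, hit.role)
```

The point of the sketch is that the index is keyed on every named person rather than only on the main subject, which is what lets a single search surface both the birth record and the later appearance as a witness.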

 

Continuing to Extract New Data, especially Missing Registers from Existing Collections

Our extractors, both professional and volunteer, have been busy. Over the past year, we have extracted or indexed more than 150,000 record entries: not only birth, marriage, and death records, but also marriage alegata, Books of Residents, and notarial records. Many of these non-vital record collections are for towns that have few or no surviving Jewish vital records.

For example, there are no surviving vital records for Dynów, a town in the Przemyśl area, but there is a 1930s set of Books of Residents that we have improved, converted and put in the queue for import into our new database. This yielded over 320 identifiable families, including some with information on four generations of relatives and how they are related to each other. We estimate that this process has surfaced more than 9,000 distinct individuals associated with a town that has no surviving vital records!

Below is a sample spreadsheet resulting from this cleaning and optimization process.

If you look at the family on the lines beginning with row 734, you will see a head of household and his wife, who is in fact his second wife. By reviewing the record, we determined that one of the children living in the household was the child of the wife’s first marriage, while the second child was the child of the head of household’s first marriage. Even the name of the second wife’s first husband is provided in the record, yielding a much richer description of the family than was previously available from the descriptions in the original Polish. This record is also interesting because the town was part of Galicia, where many Jews had religious marriages that were not recognized civilly, so following the surnames of the people in the record is more complicated than for towns in Congress Poland.

Clean Up the Data, Improve It and Optimize the Format

Third, the combination of our “critical mass” of data, a term used often by our late founder Stanley Diamond (ז״ל), and our analytical work is producing fascinating discoveries that might otherwise be lost to history. In Congress Poland, for example, our 19th-century data allows us to identify, and deduce surnames for, individuals who appear in the 1808-1825 mixed-denomination records with only patronymics and no surnames. This has enabled some researchers to trace ancestral lines to the beginning of the 1700s and, once in a blue moon, even to the late 1600s. Marriage alegata often provide copies of birth and death records that no longer survive in their original registers. Many individuals included in the Books of Residents never had recorded birth records in the first place; our experts advise that these Books of Residents entries are often the only proof of the person’s existence in the town.

We know that some of you are frustrated that we have not added data to the Legacy system in some time. This is by design: our new system is vastly different from our Legacy database. Continuing to add to the Legacy system was creating great duplication of effort, so we had to restrict new development to our new format, which is not compatible with the Legacy system format.

Please know that we have been hard at work behind the curtain, concentrating on building our new database with the goal of providing the best experience for your research. We have many active projects and a slew of data pouring in and waiting to be reviewed and curated. In the meantime, please reach out to your town leader or to JRI-Poland leadership to learn more about the projects and data development for your towns of interest. We do have much data offline in our new format, and it is available to our volunteers as they help researchers like you.

Thank you to our 160+ volunteers who contribute their hard work and their time. And thank you, everyone, for your support. We look forward to leaps and bounds of progress in 2026.

Howard Zakai

Vice-President, JRI-Poland