Looks Right, Isn’t Right: The Problem with AI-Generated Genealogy Citations

Why use a dedicated purpose-built citation generator for your genealogical sources rather than one of the many freely available AI tools (such as ChatGPT, Claude, Microsoft Copilot, or Google Gemini)? We’ve done some research into the current capabilities of these AI tools in this area, and the accuracy/correctness of the citations they produce.

There’s no doubt that Artificial Intelligence is an incredible technology that we now use in many ways in our everyday lives, and it is advancing at a rapid rate. It can save us huge amounts of time and even help us get started on certain tasks by removing those mental blocks when staring at a blank screen.

Artificial intelligence can be a helpful assistant in genealogy, but it should not yet be relied upon to produce fully accurate source citations. While AI tools can generate citations that look correct, the way they work introduces several risks that genealogists need to understand. In order to fully appreciate how these problems may occur, we should first look at how AI works, specifically when it comes to generating text (like a genealogical source citation).

AI systems like ChatGPT are built on large language models (LLMs), which are a type of neural network trained on vast collections of text. At their core, these systems use a mathematical architecture known as a transformer. Transformers process language by examining the relationships between all the words in a passage at once, using attention mechanisms to decide which parts of the input matter most. Your input (names, dates, places) is first converted into numerical representations, and the model then generates text from those representations one small piece (a “token”) at a time. A neural network learns patterns by processing huge numbers of examples, making predictions, comparing them to the correct answers, and adjusting its internal connections (between artificial “neurons”) to reduce errors. In doing so, it gradually captures increasingly complex relationships in the data, such as grammar, style, and even specialised formats like citation styles, which allows it to generalise those patterns when generating new text.

However, this process is fundamentally probabilistic rather than factual. The model does not retrieve or verify a specific citation template. Instead, it predicts what a citation should look like based on learned patterns. If genealogical citations often include archive names, registration numbers, or formal phrasing, the model will reproduce those features, even when some of the details were never provided. This is why AI can produce citations that appear authoritative while containing subtle errors or entirely fabricated elements.

An additional complication is that AI-generated results are not always consistent. Different AI models are trained on different datasets, use different architectures, and may be fine-tuned for different purposes. As a result, two models given the same genealogical details may produce noticeably different citations—varying in wording, structure, completeness, or even in the inclusion of incorrect details.

Even within the same model, outputs can vary from one attempt to another. This is because text generation involves an element of controlled randomness. At each step, the model selects from a range of likely next words rather than always choosing a single fixed option. Small differences in this selection process (sometimes influenced by system settings, updates, or even subtle differences in the prompt) can lead to variations in the final citation. Over time, updates to the model itself may also change how it formats or prioritises information, meaning that the same request made months apart could yield different results.
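The “controlled randomness” described above can be illustrated with a toy sketch in Python. This is not how any particular AI product is implemented: the candidate phrases and their scores are invented for illustration, and sample_next_word is a hypothetical helper. It only shows the general idea of choosing the next piece of text by weighted chance, with a “temperature” setting that controls how adventurous the choice is.

```python
import math
import random


def sample_next_word(scores, temperature=1.0, rng=None):
    """Pick one candidate from a dict of candidate -> raw score.

    Scores are turned into probabilities (a softmax); a lower temperature
    makes the highest-scoring candidate win more often. Temperature must
    be greater than zero.
    """
    rng = rng or random.Random()
    words = list(scores)
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the peak for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]


# Invented scores for what might follow "NSW Registry of Births, Deaths ..."
scores = {"& Marriages": 2.0, "and Marriages": 1.8, "Marriages": 0.5}

# Five attempts at the same step can give different continuations.
for attempt in range(5):
    print(attempt + 1, sample_next_word(scores, temperature=1.0))
```

With a very low temperature the top-scoring phrase is chosen almost every time; with a higher one, the alternatives appear regularly. Repeat that over every step of a citation and it is easy to see how two attempts can diverge.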

These characteristics introduce several practical problems. AI may “hallucinate” missing components, such as repository names or certificate descriptors. It may inconsistently apply formatting rules or blend conventions from different citation systems. Because it cannot access or evaluate the original source, it cannot distinguish between an original record and a derivative index—an essential distinction in genealogy. The variability between outputs further complicates matters, making it difficult to rely on AI for consistent citation standards across a body of work.

Finally, because an AI system may lack access to up-to-date “restricted access” style guides (such as those used on some academic courses), it cannot reliably generate the correct citation and has to fall back on guesswork (although some AI tools are good enough to tell you that this is the case).

For these reasons, AI-generated citations should be treated as drafts or starting points rather than authoritative references. Genealogists should always verify details against the original record and apply their own knowledge of citation standards. Used carefully, AI can save time—but accuracy and consistency still depend on informed human judgment.

Or, better still, use a dedicated purpose-built genealogy citation generator, like Cite-Builder, that has hundreds of specific templates, each of which has been researched, built and verified for its accuracy by a human being!
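To make the contrast with prediction concrete, here is a minimal sketch in Python of what “filling a fixed template” means. It is an assumption-laden illustration, not Cite-Builder's actual code: the names FIRST_FOOTNOTE, REQUIRED, expand_month and build_first_footnote are hypothetical, and the wording of the template simply follows the Cite-Builder first-footnote example shown later in this article.

```python
# Hypothetical template: wording copied from the first-footnote example
# in this article, not taken from any real product's source code.
FIRST_FOOTNOTE = (
    "New South Wales, marriage certificate (certified copy) for "
    "{husband} and {wife}, married {date}, registration number "
    "{registration_number}, {district} district; "
    "NSW Registry of Births Deaths & Marriages, Chippendale."
)
REQUIRED = ("husband", "wife", "date", "registration_number", "district")

MONTHS = {
    "Jan": "January", "Feb": "February", "Mar": "March", "Apr": "April",
    "May": "May", "Jun": "June", "Jul": "July", "Aug": "August",
    "Sep": "September", "Oct": "October", "Nov": "November", "Dec": "December",
}


def expand_month(date_text):
    """Turn '22 Nov 1899' into '22 November 1899'; leave other forms alone."""
    parts = date_text.split()
    if len(parts) == 3 and parts[1] in MONTHS:
        parts[1] = MONTHS[parts[1]]
    return " ".join(parts)


def build_first_footnote(fields):
    """Fill the fixed template, refusing to guess at anything missing."""
    missing = [name for name in REQUIRED if not fields.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    values = dict(fields, date=expand_month(fields["date"]))
    return FIRST_FOOTNOTE.format(**values)


print(build_first_footnote({
    "husband": "William Brockbank",
    "wife": "Merab Annesley",
    "date": "22 Nov 1899",
    "registration_number": "8388/1899",
    "district": "Lithgow",
}))
```

The point of the sketch is the behaviour rather than the detail: the same input always gives the same citation, and a missing detail produces an error instead of an invented value.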

Let’s summarise the problems with relying on AI-generated citations:


Not convinced yet? Then look at the following example. We gave Google AI (Gemini), ChatGPT, Claude, and Copilot the same prompt and input data (when prompted for it by the AI) to generate a citation for a certified copy of a marriage certificate from New South Wales. We started deliberately vague to see how the AI would interpret our prompt and ask us for additional information. We also built the citation using Cite-Builder for comparison. The Cite-Builder template used is based directly on guidance and examples in the Evidence Explained 4th Edition physical book and was built by a member of our team with nearly 50 years of experience in genealogical research, so we can be reasonably confident that the result from Cite-Builder is correct.

Prompt: “create a citation for a certified copy of a marriage certificate. Prompt me for what information is needed.”

Information supplied when prompted:

Husband: William Brockbank
Wife: Merab Annesley
Date: 22 Nov 1899
Registration District / Place of Marriage*: Lithgow
Registration Number: 8388/1899
Citation Style: Evidence Explained

*Note: Some AI systems asked for the Registration District, some asked for the Place of Marriage, and some asked for both (as they may not be the same). In all cases, we entered “Lithgow” for this example.

Cite-Builder

Using template: Civil Births, Marriages, Deaths & Divorces > Australia > Certified Copy of Birth, Marriage & Death Certificates > Marriages > NSW Marriage Certificate

1st Footnote:

New South Wales, marriage certificate (certified copy) for William Brockbank and Merab Annesley, married 22 November 1899, registration number 8388/1899, Lithgow district; NSW Registry of Births Deaths & Marriages, Chippendale.

Subsequent Footnote:

New South Wales, marriage certificate (certified copy), William Brockbank and Merab Annesley, Registration Number 8388/1899.

Bibliography:

New South Wales. Marriage Certificates. NSW Registry of Births Deaths & Marriages, Chippendale.

Google AI

1st Footnote:

New South Wales, marriage certificate (certified copy) no. 8388 (1899), William Brockbank and Merab Annesley, married 22 November 1899; NSW Registry of Births, Deaths & Marriages, Chippendale; in possession of [Your Name], [Current Year].

Subsequent Footnote:

New South Wales, marriage certificate (certified copy) no. 8388 (1899), Brockbank-Annesley.

Bibliography:

New South Wales. Registry of Births, Deaths & Marriages. Marriage Certificates. NSW Registry of Births, Deaths & Marriages, Chippendale, Australia.

Differences between Google AI and Cite-Builder results:

Google AI

ChatGPT

1st Footnote:

William Brockbank and Merab Annesley, marriage certificate, 22 November 1899, Lithgow, New South Wales, registration no. 8388/1899; certified copy, New South Wales Registry of Births, Deaths and Marriages.

Subsequent Footnote:

Brockbank–Annesley marriage certificate, 22 November 1899, reg. no. 8388/1899.

Bibliography:

New South Wales Registry of Births, Deaths and Marriages. Marriage certificate for William Brockbank and Merab Annesley, 22 November 1899, Lithgow, New South Wales. Registration no. 8388/1899. Certified copy.

Differences between ChatGPT and Cite-Builder results:

ChatGPT

Claude

1st Footnote:

New South Wales Registry of Births, Deaths and Marriages, marriage certificate no. 8388/1899, William Brockbank and Merab Annesley, 22 November 1899, Lithgow; certified copy issued [date issued], privately held by [Your Name, Your City, State/Country, year you obtained it].

Subsequent Footnote:

NSW Registry of BDM, marriage cert. no. 8388/1899, Brockbank–Annesley (1899).

Bibliography:

New South Wales Registry of Births, Deaths and Marriages. Marriage Certificates. Certified copy of no. 8388/1899, Brockbank–Annesley, 1899, privately held by [Your Name, Your Location].

Differences between Claude and Cite-Builder results:

Claude

Copilot

1st Footnote:

New South Wales Registry of Births, Deaths and Marriages, Marriage Certificate, certified copy for William Brockbank and Merab Annesley, married 22 November 1899, Lithgow, New South Wales; registration no. 8388/1899.

Subsequent Footnote:

NSW BDM, Marriage Certificate, Brockbank–Annesley, 8388/1899.

Bibliography:

New South Wales Registry of Births, Deaths and Marriages. Marriage Certificate, Brockbank–Annesley, 22 November 1899, Lithgow. Registration no. 8388/1899.

Differences between Copilot and Cite-Builder results:

Copilot

Overall Comparisons


This comparison highlights how each tool subtly alters structure, detail, or intent—even when working from the same information.
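If you do use an AI draft as a starting point, part of the first-pass checking can be automated. The Python sketch below is a rough illustration under stated assumptions: check_draft is a hypothetical helper, and it only looks for two things, namely whether each detail you supplied appears word for word, and whether any square-bracket placeholders were left unfilled. The draft text is the Google AI first footnote from the comparison above.

```python
import re


def check_draft(draft, supplied):
    """Return a list of problems found in an AI-drafted citation."""
    problems = []
    # Every detail we supplied should appear exactly as given.
    for label, value in supplied.items():
        if value not in draft:
            problems.append(f"supplied {label} not found verbatim: {value}")
    # Square-bracket text such as [Your Name] is an unfilled placeholder.
    for placeholder in re.findall(r"\[[^\]]+\]", draft):
        problems.append(f"unfilled placeholder: {placeholder}")
    return problems


supplied = {
    "husband": "William Brockbank",
    "wife": "Merab Annesley",
    "date": "22 November 1899",
    "district": "Lithgow",
    "registration number": "8388/1899",
}

draft = (
    "New South Wales, marriage certificate (certified copy) no. 8388 (1899), "
    "William Brockbank and Merab Annesley, married 22 November 1899; "
    "NSW Registry of Births, Deaths & Marriages, Chippendale; "
    "in possession of [Your Name], [Current Year]."
)

for problem in check_draft(draft, supplied):
    print(problem)
```

A check like this catches omissions and reformatted details, but it cannot tell whether the structure follows the chosen style guide or whether an added detail is genuine. That still needs a person comparing the citation with the original record.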

Full Disclosure: ChatGPT was used in the creation of this article to compare the different outputs and summarise them. As this is a pure text comparison of a small number of inputs, we feel this is a legitimate use of the tool!