The Education Forum

The use of Artificial Intelligence in JFKA research


Recommended Posts

Firstly, apologies if this topic has already been covered elsewhere, but I'm interested to know if any JFKA researchers have used, or are using, AI as part of their work, or how valuable researchers believe AI could be in this quest.

If it is being used - how so? And what specific areas of JFKA research do people think AI could be a useful tool for?

Given the sheer volume of documents, research, books, audio, still and moving images, and other resources pertaining to the assassination, my view is AI could be very useful indeed. Possibly even a game changer.

I see potential particularly around things like image enhancement, facial recognition, and also modelling the likelihood of various scenarios.

Would be interested to know the thoughts of others.


The day before Super Bowl 58, someone asked an AI chatbot for a prediction. The AI said it couldn't predict the game because, according to the AI, it had already been played and San Francisco had won. It even gave the final score.

I'm one of those people who thinks a wrong answer is worse than no answer, and it certainly seems that if AI doesn't know the correct answer to a question, it does its level best to give you an answer anyway, regardless of accuracy - or even possibility.


7 minutes ago, Denny Zartman said:

The day before Super Bowl 58, someone asked an AI chatbot for a prediction. The AI said it couldn't predict the game because, according to the AI, it had already been played and San Francisco had won. It even gave the final score.

I'm one of those people who thinks a wrong answer is worse than no answer, and it certainly seems that if AI doesn't know the correct answer to a question, it does its level best to give you an answer anyway, regardless of accuracy - or even possibility.

There are various problems with AI as it stands. Firstly, we need to remember it is still very early in its evolution. It's not a finished article and that's the point - it never will be. It will continue to get ever more sophisticated, and will do so exponentially. I think its potential is quite mind-blowing.

The other problem is that lay people like ourselves have very little understanding of AI, and I would go as far as to say many so-called 'experts' have a limited grasp of it and its potential.

The fact of the matter is, AI is already being used by intelligence agencies and law enforcement. It's being used as a tool to solve cold cases.

This is the biggest cold/unsolved case of the lot. 


2 minutes ago, Ben Green said:

There are various problems with AI as it stands. Firstly, we need to remember it is still very early in its evolution. It's not a finished article and that's the point - it never will be. It will continue to get ever more sophisticated, and will do so exponentially. I think its potential is quite mind-blowing.

The other problem is that lay people like ourselves have very little understanding of AI, and I would go as far as to say many so-called 'experts' have a limited grasp of it and its potential.

The fact of the matter is, AI is already being used by intelligence agencies and law enforcement. It's being used as a tool to solve cold cases.

This is the biggest cold/unsolved case of the lot. 

It's clear you want to believe in the potential of AI. I can't stop you. It will inevitably be used in research anyway.

But the question asked on Saturday, "Has Sunday's game been played yet?", is not a difficult one at all. Yet the AI asserted three things: that the game had already been played, that the 49ers had won it, and the exact final score. All three were wrong. Not just wrong, but backed up with imaginary facts, like the final score, to support the incorrect answer.

What if someone asks an AI to calculate the trajectory of a bullet and, since it doesn't know, it just makes up an answer, including impressive extraneous information? Then we'll have people like us arguing over it for years, because we assume the AI is better at complex calculations than we are, so it must be right.

As I see it, AI wants to please. If it doesn't have an answer, it will try to give you one anyway. If AI is asked to scan a photo of the fence line for human-looking figures, how do we know for certain it won't make them up, since it knows that's what we're looking for?


2 hours ago, Ben Green said:

Firstly, apologies if this topic has already been covered elsewhere, but I'm interested to know if any JFKA researchers have used, or are using, AI as part of their work, or how valuable researchers believe AI could be in this quest.

If it is being used - how so? And what specific areas of JFKA research do people think AI could be a useful tool for?

Given the sheer volume of documents, research, books, audio, still and moving images, and other resources pertaining to the assassination, my view is AI could be very useful indeed. Possibly even a game changer.

I see potential particularly around things like image enhancement, facial recognition, and also modelling the likelihood of various scenarios.

Would be interested to know the thoughts of others.

I work with AI every day, and I think the easiest to develop and most immediately useful application for JFK research would be an LLM enhanced search engine. Right now on MFF for example, you can only search for exact keyword matches, which is great, but it can be kind of annoying finding related documents if they don’t have matching text. The RIF search is great too, and incredibly useful, but you kind of have to know what you’re looking for to really leverage it. 

If you converted all the OCR text to vector embeddings, attached the RIF sheet metadata, kept all the document set filtering capabilities, then slapped an LLM retrieval bot on top, you could search through the collection with semantic queries like: 

“Give me every CIA document dated from 11/22 to 11/25/63 that discusses Oswald’s mode of travel in and out of Mexico City” 

or 

“Find me every mention of the entrance wound in WC and HSCA testimony, sort them by date, and respond in list format with date, witness, interviewing counsel, # of commissioners present at the hearing, and the actual exchange or exchanges discussing the wound.” 

The bot can be easily configured to return links to the source documents. The stuff Denny mentioned can be completely eliminated through prompt engineering and/or fine tuning. 
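As a rough illustration of that retrieval step, here is a minimal sketch in Python, assuming the OpenAI embeddings API and a small, hypothetical in-memory list of OCR'd documents with RIF metadata (a real system would keep the vectors in a proper database):

```python
# Minimal semantic-search sketch: embed the OCR text once, then rank documents
# against a natural-language query by cosine similarity.
# The document list and RIF numbers below are placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

documents = [
    {"rif": "000-00000-00000",  # placeholder RIF number
     "url": "https://example.org/doc1.pdf",
     "text": "OCR text of the document goes here..."},
    # ... one entry per OCR'd page or document
]

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

# Embed the collection once (in practice, store these in a vector DB).
for doc in documents:
    doc["vector"] = embed(doc["text"])

def search(query: str, top_k: int = 5):
    q = embed(query)
    def cosine(v):
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    ranked = sorted(documents, key=lambda d: cosine(d["vector"]), reverse=True)
    return [(d["rif"], d["url"]) for d in ranked[:top_k]]

for rif, url in search("Oswald's mode of travel in and out of Mexico City, Nov 1963"):
    print(rif, url)
```

The retrieval bot is basically this plus an LLM call that summarises or formats whatever the search returns, with the source links passed straight through.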

You could also get the LLM’s interpretation of documents, but one of the problems there is you can lose a lot of meaningful information converting PDFs to plain text, like document structure, marginalia, etc. That is fixable, but it’d be a daunting and very expensive task to do it for the entire collection. 

If MFF exposed all their data including the pdf links and OCR text through APIs, I bet a solid RAG bot could be developed in a week or less. The biggest question I think would be where to host the embeddings. I’m not sure what MFF uses but a lot of db services now offer native vector search capabilities. There are also dedicated vector databases like Pinecone, etc. Ideally though all the data could be obtained in one query, so something like MongoDB Vector Search would be ideal, but it wouldn’t be cheap to host 1M+ documents, embeddings, etc. 
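To give a feel for what the hosting side involves, the query against something like MongoDB Atlas Vector Search might look roughly like this. It's only a sketch: the connection string, database, collection, index name, and field names are all hypothetical, and it assumes the embeddings and RIF metadata have already been loaded.

```python
# Sketch of a vector-similarity query against a hypothetical MFF-style
# collection in MongoDB Atlas, using the $vectorSearch aggregation stage.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
coll = client["jfk"]["documents"]  # hypothetical database/collection names

# Placeholder: in practice this is the embedding of the user's query text.
query_vector = [0.0] * 1536

pipeline = [
    {"$vectorSearch": {
        "index": "ocr_embeddings",   # hypothetical vector index name
        "path": "embedding",         # field holding the stored vectors
        "queryVector": query_vector,
        "numCandidates": 200,
        "limit": 10,
    }},
    {"$project": {"rif": 1, "agency": 1, "date": 1, "pdf_url": 1, "_id": 0}},
]

for doc in coll.aggregate(pipeline):
    print(doc["rif"], doc.get("pdf_url"))
```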

Denny’s concerns are not really valid. The behavior of an LLM is highly configurable. Literally all you have to do to prevent simple hallucinations in most cases is add a system prompt telling the LLM to not respond if it doesn’t know the answer. In RAG bots it’s even easier since you can force a function call to query your database. 
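For example, a grounding instruction of that kind might look something like the sketch below, using the OpenAI chat API. The model name and the exact wording of the system prompt are just illustrative.

```python
# Sketch: a system prompt that tells the model to refuse rather than guess,
# and to answer only from the source text supplied in the user message.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a research assistant for JFK assassination records. "
    "Answer ONLY using the source excerpts provided in the user message. "
    "If the answer is not contained in those excerpts, reply exactly: "
    "'I don't know based on the provided documents.' Never guess."
)

def answer(question: str, retrieved_excerpts: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user",
             "content": f"Sources:\n{retrieved_excerpts}\n\nQuestion: {question}"},
        ],
        temperature=0,  # lower temperature further reduces creative guessing
    )
    return resp.choices[0].message.content
```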

The main limitation of RAG is you can only inject so much text into a prompt without exceeding token limits, plus the more you inject, the more you pay. The next level up would be training/fine-tuning an LLM on the entire collection, but that’d be a hell of a lot more complicated to develop and deploy. Preparing the training/testing data would be a massive pain in the ass. 

A good test case I think would be setting up a RAG bot on a smaller dataset, like WC testimony only, or something like that. Heck, you could probably just grab the text off the McAdams site. For best results though, I’d probably convert each hearing to JSON or something, and include embedded metadata, etc. That way you could ask the AI more focused questions, and format your prompt injection to reflect detailed info on each question/answer pair. 
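For the WC testimony idea, each hearing could be broken into records along these lines. The structure and field names are purely illustrative (shown as a Python dict, but the JSON would look the same):

```python
# Hypothetical record structure for one question/answer exchange from
# Warren Commission testimony, with metadata attached for filtering and
# for formatting prompt injections.
exchange = {
    "volume": 3,                                  # hearings volume number
    "witness": "placeholder witness name",
    "date": "1964-03-24",
    "questioner": "placeholder counsel name",
    "commissioners_present": ["placeholder names"],
    "page": 123,
    "question": "Text of the question as transcribed...",
    "answer": "Text of the witness's answer...",
    "topics": ["entrance wound", "autopsy"],      # optional hand or auto tagging
    "embedding": None,  # filled in later with the vector for question + answer
}
```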


Another thing I just thought of that could be interesting: you could train an image classification model on thousands of skull photos, autopsy photos, etc. and try to get it to orient the mystery photo. The accuracy would be totally dependent on the quality of your training data, and I’m not sure how you’d collect all that because you’d probably need a ton of different angles, zoom lengths, etc., but in theory it should be possible. 
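For anyone curious what that looks like in practice, here is a bare-bones sketch of fine-tuning a pretrained network for that kind of classification. It assumes a hypothetical labelled folder of photos organised by orientation class, which is exactly the hard part of collecting the training data mentioned above.

```python
# Sketch: fine-tune a pretrained ResNet-18 to classify photo orientation.
# Assumes a hypothetical directory layout like:
#   orientation_data/train/<class_name>/*.jpg   (one folder per orientation class)
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("orientation_data/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # ImageNet start
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```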


24 minutes ago, Tom Gram said:

I work with AI every day, and I think the easiest to develop and most immediately useful application for JFK research would be an LLM enhanced search engine. Right now on MFF for example, you can only search for exact keyword matches, which is great, but it can be kind of annoying finding related documents if they don’t have matching text. The RIF search is great too, and incredibly useful, but you kind of have to know what you’re looking for to really leverage it. 

If you converted all the OCR text to vector embeddings, attached the RIF sheet metadata, kept all the document set filtering capabilities, then slapped an LLM retrieval bot on top, you could search through the collection with semantic queries like: 

“Give me every CIA document dated from 11/22 to 11/25/63 that discusses Oswald’s mode of travel in and out of Mexico City” 

or 

“Find me every mention of the entrance wound in WC and HSCA testimony, sort them by date, and respond in list format with date, witness, interviewing counsel, # of commissioners present at the hearing, and the actual exchange or exchanges discussing the wound.” 

The bot can be easily configured to return links to the source documents. The stuff Denny mentioned can be completely eliminated through prompt engineering and/or fine tuning. 

You could also get the LLM’s interpretation of documents, but one of the problems there is you can lose a lot of meaningful information converting PDFs to plain text, like document structure, marginalia, etc. That is fixable, but it’d be a daunting and very expensive task to do it for the entire collection. 

If MFF exposed all their data including the pdf links and OCR text through APIs, I bet a solid RAG bot could be developed in a week or less. The biggest question I think would be where to host the embeddings. I’m not sure what MFF uses but a lot of db services now offer native vector search capabilities. There are also dedicated vector databases like Pinecone, etc. Ideally though all the data could be obtained in one query, so something like MongoDB Vector Search would be ideal, but it wouldn’t be cheap to host 1M+ documents, embeddings, etc. 

Denny’s concerns are not really valid. The behavior of an LLM is highly configurable. Literally all you have to do to prevent simple hallucinations in most cases is add a system prompt telling the LLM to not respond if it doesn’t know the answer. In RAG bots it’s even easier since you can force a function call to query your database. 

The main limitation of RAG is you can only inject so much text into a prompt without exceeding token limits, plus the more you inject, the more you pay. The next level up would be training/fine-tuning an LLM on the entire collection, but that’d be a hell of a lot more complicated to develop and deploy. Preparing the training/testing data would be a massive pain in the ass. 

A good test case I think would be setting up a RAG bot on a smaller dataset, like WC testimony only, or something like that. Heck, you could probably just grab the text off the McAdams site. For best results though, I’d probably convert each hearing to JSON or something, and include embedded metadata, etc. That way you could ask the AI more focused questions, and format your prompt injection to reflect detailed info on each question/answer pair. 

Thanks Tom, this was exactly the sort of insight I was looking for. Like I say, my knowledge and understanding of AI is fairly limited, but based on what I do know, my hunch is that it could be of some use to JFKA researchers (certainly in the future as AI becomes more sophisticated). 


I have thought for a long time that it would certainly be possible, if someone or a group knew what they were doing, to create a giant database from all the pics and videos from Dealey Plaza that day. If none of those pics or videos have been tampered with (and maybe even if they have to some degree), you could have a computer program recognize all the information in each media source and put it all together. You'd basically have a 360° model of the scene, and you'd know where everyone in the crowd was at a given point in time. And if the information doesn't match up (like different people in the crowd from one media source to another in the same time frame), then at least we'll have definitive proof that at least some of the pics and videos have been tampered with to some degree.

You could also digitally enhance each picture or series of pictures (video) using information from other sources. Say, for instance, the limo is blurry in one source but not in another at the same time frame. The program could possibly recognize that and use the information from the clear image to sharpen the blurry one. You get what I'm saying? I've watched a lot of documentaries on restoring classic movies like The Wizard of Oz, and they do similar things to restore the film for future releases.

I realize that something like this could take years or maybe even a decade or more, but what are we waiting for? We have to start somewhere! You could take different images of the Grassy Knoll area from different media sources during time frames close together and say: this is the factual data we have. This is where the fence is. This is where the concrete is. Here are the trees. Clarify all the images until you are left only with objects that aren't "set" - possibly a person or an unidentified object that is in one source but not in another seconds later, so you know it's not a "constant" artifact of those surroundings.

I may be explaining what I'm thinking poorly, so I hope my point is not getting lost, lol! I just think we are not using the available technology nearly enough in JFK assassination research. We need people trained in these areas to work on some of this stuff.
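As a rough illustration of the kind of building block that already exists for this, here is a sketch of aligning one photo of a scene to another using OpenCV feature matching, so the two can be compared frame to frame. The file names are placeholders, and real Dealey Plaza sources would of course need far more careful handling.

```python
# Sketch: align one photo of a scene to another using ORB feature matching
# and a RANSAC homography, so the two can be compared directly.
import cv2
import numpy as np

ref = cv2.imread("clear_frame.jpg", cv2.IMREAD_GRAYSCALE)       # placeholder file
blurry = cv2.imread("blurry_frame.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file

orb = cv2.ORB_create(5000)
kp1, des1 = orb.detectAndCompute(ref, None)
kp2, des2 = orb.detectAndCompute(blurry, None)

# Match descriptors and keep the strongest matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)

# Estimate the transform that maps the blurry frame onto the clear one.
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
aligned = cv2.warpPerspective(blurry, H, (ref.shape[1], ref.shape[0]))
cv2.imwrite("blurry_aligned_to_clear.jpg", aligned)
```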


Greek to me. And to most older Americans.

I recently viewed a news clip of reporters asking members of the House and Senate how much they knew about AI. It seemed almost everyone questioned stood with a blank stare, and several flat out admitted... "not much."

In his posts above, Tom Gram threw out acronyms so foreign to me that I broke out laughing. Yet he is so sharp that this is a language he understands inherently.

Tom Gram is clearly our man to turn to regarding AI on the forum.

Advanced facial recognition technology could be a game changer in the case. Scan every photo known to exist of the Dealey Plaza crowd. See what pops up.

Rip Robertson? E. Howard Hunt? The "Dark Complexion" man? Who knows. Scan every Lee Oswald leaflet photo in New Orleans. Will Bill Shelley pop up? 

Takes money though. More than I could ever come up with.


First-generation researcher Richard E. Sprague had one of the largest private collections of photographs, video, and audio of the assassination.


"This series contains approximately 460 photographs, slides, and films concerning the JFK assassination. There are also audio tapes containing interviews and radio broadcasts. Each visual image has its own number in the Photographic Archives (PA) system-- a system which Sprague developed and published in a 1970 article in Computers and Automation. Some items are oversized and stored separately; these items are marked with an "O." The key to abbreviations provides information on other listings such as "N" for negative and "S" for slide. The film and television footage in this series were placed on one videotape to preserve the originals."

from https://www.archives.gov/research/jfk/finding-aids/sprague-papers.html#photographs

I checked the National Archives site a few weeks back to see if they'd digitized Sprague's collection; they haven't. It's really important to see these first-generation photos. Most of the photo files online are cropped or of poor quality. There could be so many important details missing through image and film cropping.

The best-quality, highest-resolution photos are going to be the key to matching faces.
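As a rough illustration of what the matching step itself might involve, here is a sketch using the open-source face_recognition library. The file names are placeholders, and real-world accuracy would depend entirely on the resolution and quality of the source photos, as noted above.

```python
# Sketch: compare a known reference face against faces found in a crowd photo,
# using the open-source face_recognition library. File names are placeholders.
import face_recognition

# Encode the reference face (e.g. a known portrait photo).
reference_image = face_recognition.load_image_file("reference_portrait.jpg")
reference_encoding = face_recognition.face_encodings(reference_image)[0]

# Find and encode every face in a crowd photo.
crowd_image = face_recognition.load_image_file("dealey_plaza_crowd.jpg")
face_locations = face_recognition.face_locations(crowd_image)
face_encodings = face_recognition.face_encodings(crowd_image, face_locations)

# Compare each detected face to the reference; lower distance means more similar.
for location, encoding in zip(face_locations, face_encodings):
    distance = face_recognition.face_distance([reference_encoding], encoding)[0]
    match = face_recognition.compare_faces([reference_encoding], encoding,
                                           tolerance=0.6)[0]
    print(location, round(float(distance), 3),
          "possible match" if match else "no match")
```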


56 minutes ago, Robert Reeves said:

It's really important to see these first-generation photos. Most of the photo files online are cropped or of poor quality. There could be so many important details missing through image and film cropping.

The best-quality, highest-resolution photos are going to be the key to matching faces.

Right on! 

I even thought about starting a "GoFundMe" to drum up funds to submit some of the most well-known photos of ominous characters in Dealey Plaza, and of bystanders at Oswald's leaflet handouts, to a top facial recognition company for analysis.

If there was a hit on one, or even more, that revealed these suspect men were right there close to Oswald, or in the Dealey Plaza crowd, the fund money would be more than well spent. If not one hit was made, however... not so good.

 


7 hours ago, Ben Green said:

Firstly, apologies if this topic has already been covered elsewhere, but I'm interested to know if any JFKA researchers have used, or are using, AI as part of their work, or how valuable researchers believe AI could be in this quest.

If it is being used - how so? And what specific areas of JFKA research do people think AI could be a useful tool for?

Given the sheer volume of documents, research, books, audio, still and moving images, and other resources pertaining to the assassination, my view is AI could be very useful indeed. Possibly even a game changer.

I see potential particularly around things like image enhancement, facial recognition, and also modelling the likelihood of various scenarios.

Would be interested to know the thoughts of others.

Used intelligently, it "can be" and "has been" a very helpful and productive tool when working with equations.


AI to make our lives better?

Like computers and smart phones have over the last 30 years?

From time to time I look back at my own life, and others', back in the 1950's to early 60's.

Every time, more and more, I am convinced that during the 50's and 60's our lives here in America were so much better in so many ways.

Number one...everything was affordable. Especially rents and all the other basic need costs. 

The cost of living just for basic needs now is a crushing stress on over half the country daily. Over the years it is totally exhausting.

Young people today, by the tens of millions, are not getting married or having kids. They can't afford to. Heck, tens of millions can't even afford their own apartment because monthly rent is more than their entire take-home pay.

Saving to buy a home? Ha! How can anyone save when these basic needs costs take everything you earn?

Electronic devices have crushed physical, in-person social interaction. That is a huge negative in so many ways. Kids in school are more socially cut off than ever before. Depression and even suicide are more common than ever.

Medical care back then was good enough. I don't recall anyone complaining about it any more than now. Telephones were fine. Pay phones were everywhere and only a dime if you needed to get in touch with someone right away. There weren't endless recordings played when you called needed persons or services - government ones especially.

Jobs were plentiful. Physical labor ones as much as professional ones. College education was affordable. Today these costs are insanely high. 

No high-cost cable TV like today. Everyone had outdoor antennas and TV viewing was FREE! Okay, we didn't have 200 stations to choose from, but 90% of what is out there today is inane.

Privacy today is so corrupted and compromised compared to those times.

Today, your computer is a surveillance tool. Your telephones. Your business actions: credit card use, credit loans and applications, etc. Company-installed car chips. License plate reading cameras watching you everywhere 24/7 - roads, businesses, private residences, even from the sky. Key words in your conversations with others. All in the name of... security?

You had personal privacy back in the 1950's and 60's. There was a moral good in that.

I could go on and on and on. Our lives HAVE NOT improved overall from the 1950's and 60's until today, 60 years later. Technology has made most people's daily lives much more complicated, expensive, and stressful, IMO anyway.

Technology that has been monetized to a blood-sucking degree of greed before it's even released to the public.


7 hours ago, Tom Gram said:

Denny’s concerns are not really valid. The behavior of an LLM is highly configurable. Literally all you have to do to prevent simple hallucinations in most cases is add a system prompt telling the LLM to not respond if it doesn’t know the answer. In RAG bots it’s even easier since you can force a function call to query your database. 

You claim there are very simple and easy fixes to prevent AI from making up a false answer to a question, yet the answer it gave to a very simple question was still false.

Why should we trust AI if the programmers can't already program in what you describe as simple and easy fail-safes that would say "I don't know" in a circumstance where it doesn't know an answer? According to you, the fix is so simple and easy that you can dismiss concerns about it in two sentences, and what I'm guessing are a few lines of code. Yet these fixes were not implemented. Are the programmers stupid? Is that really it? If code to avoid false answers is as simple and easy as you claim, why wasn't it already there? More importantly: Why is avoiding false answers not already a priority for AI?

I'm no AI expert, but I fail to see how false answers help anyone. And you're not really inspiring confidence in overall AI programming if there exist super simple fixes that apparently didn't even occur to the programmers to add. In my opinion, false answers are just going to lead to more confusion, uncertainty, and wasted time.

If we ask AI a complex question for which we can't independently verify the answer, how can we trust any answer that it gives? We just have to trust that the programmer got the programming right? Or do we have to comb through the code ourselves?


4 minutes ago, Denny Zartman said:

You claim there are very simple and easy fixes to prevent AI from making up a false answer to a question, yet the answer it gave to a very simple question was still false.

Why should we trust AI if the programmers can't already program in what you describe as simple and easy fail-safes that would say "I don't know" in a circumstance where it doesn't know an answer? According to you, the fix is so simple and easy that you can dismiss concerns about it in two sentences, and what I'm guessing are a few lines of code. Yet these fixes were not implemented. Are the programmers stupid? Is that really it? If code to avoid false answers is as simple and easy as you claim, why wasn't it already there? More importantly: Why is avoiding false answers not already a priority for AI?

I'm no AI expert, but I fail to see how false answers help anyone. And you're not really inspiring confidence in overall AI programming if there exist super simple fixes that apparently didn't even occur to the programmers to add. In my opinion, false answers are just going to lead to more confusion, uncertainty, and wasted time.

If we ask AI a complex question for which we can't independently verify the answer, how can we trust any answer that it gives? We just have to trust that the programmer got the programming right? Or do we have to comb through the code ourselves?

The problem is with the end user, not the model programmers. OpenAI, for example, has a feature for adding system prompts to their basic ChatGPT interface. You just have to know how to use it. 

An LLM is just a text generation tool. It’s not magic. It’s only as good as the data you put into it, and unless you tell it not to, it will always generate text in the best way it knows how. 

LLM knowledge is also cut off at the date of its training data, so if you ask it a question about something that happened after that date, it will not know. One way to get around that is using retrieval augmented generation, i.e. RAG, which is when you inject additional source data into your prompt, like from a web search tool (or a JFK database), to give the model more context to answer your question. 

Point is, unless you tell an AI not to answer unless specific criteria are met, it will always try to answer. It’s just how LLMs work. With RAG it’s extra easy since you can tell the AI to respond only if the specific answer exists in your provided source data. 

Here’s a pretty good article on hallucinations and how to prevent them: 

https://medium.com/@harish8383/how-to-minimize-hallucinations-in-ai-models-with-effective-prompts-aba5cf173a77

