Comprehend AI - Elevate Your Reading Comprehension Skills!
This is a submission for the Cloudflare AI Challenge.
What I Built
Developing strong reading comprehension skills is crucial for navigating today's information-rich world. Comprehend AI is a web app that helps you practice your reading comprehension skills by giving you a set of multiple-choice questions generated from web articles.
Demo
https://comprehend-mvp.pages.dev/
My Code
I used two repos, one for the frontend and one for the backend.
Frontend: https://github.com/fjoeda/comprehend-mvp
Backend: https://github.com/fjoeda/comprehend-engine
Journey
The Comprehend AI frontend was built on top of ReactJS, while the engine (backend) was built with Python, using django-ninja as the web API framework and Cloudflare Workers AI for the AI services. The engine behind Comprehend AI consists of two main parts: the article retriever and the question generator.
The Article Retriever
The article is retrieved from the web page using a data scraping technique, with the help of the beautifulsoup4 library. The scraping process simply extracts the text from the HTML p tags, since these contain the article's paragraphs. The paragraphs are stored in a list, from which one element is randomly selected to give the question generator the context for creating a question about a specific part of the article.
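As a rough illustration of the paragraph extraction and random selection described above, here is a minimal sketch. To keep it dependency-free it uses Python's standard-library html.parser as a stand-in for beautifulsoup4, so it is not the code from the repo. The names ParagraphExtractor, extract_paragraphs, and pick_context are hypothetical.

```python
import random
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text of every <p> element (stdlib stand-in for beautifulsoup4)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends where the next one begins
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)

    def _flush(self):
        # Join the collected pieces and collapse runs of whitespace.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []


def extract_paragraphs(html):
    """Return the non-empty paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep a trailing paragraph that was never closed
    return parser.paragraphs


def pick_context(paragraphs, rng=random):
    """Pick one paragraph at random to serve as the question context."""
    return rng.choice(paragraphs)


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>First <b>bold</b> paragraph.</p><p>Second one.</p>"
    print(pick_context(extract_paragraphs(sample)))
```

Passing a seeded random.Random instance as rng makes the selection reproducible, which is handy when debugging the question generator against a fixed article.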
For the article source, I select an article at random from phys.org, an online science news aggregator, using its RSS feed to get the list of article items and URLs.
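The RSS step can be sketched roughly as follows, assuming a standard RSS 2.0 feed where each item has a title and a link. The sample feed, its example.com URLs, and the function names are made up for illustration. Fetching the real feed (for example with urllib.request) is left out because the feed URL is not given here.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be downloaded from the news site.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example science feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/news/first</link>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/news/second</link>
    </item>
  </channel>
</rss>"""


def parse_feed_items(rss_xml):
    """Return a list of {"title", "url"} dicts, one per <item> that has a link."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:
            items.append({"title": title, "url": link})
    return items


def pick_random_article(items, rng=random):
    """Choose one feed item at random."""
    return rng.choice(items)


if __name__ == "__main__":
    article = pick_random_article(parse_feed_items(SAMPLE_FEED))
    print(article["title"], article["url"])
```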
The Question Generator
Two models were used for the question generator: @cf/mistral/mistral-7b-instruct-v0.1 as the main model, and @cf/meta/llama-2-7b-chat-int8 as a fallback when the main model endpoint fails (which I ran into during development). In my experience, @cf/mistral/mistral-7b-instruct-v0.1 gives the more consistent response, which matters because I need the multiple-choice question in JSON format. @cf/meta/llama-2-7b-chat-int8 also did well, though it occasionally precedes the JSON-formatted response with a statement. Thanks to the "no yapping" prompt trick, the model gives me the JSON response directly. The question generator returns a question about a certain part of the article, the correct answer, and the decoy options.
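The fallback and JSON handling described above could look roughly like the sketch below. It is not the code from the repo. The prompt wording, the JSON keys ("question", "answer", "decoys"), and all function names are assumptions, and call_model is a hypothetical callable that would wrap the actual Workers AI request and return the model's text.

```python
import json

PRIMARY_MODEL = "@cf/mistral/mistral-7b-instruct-v0.1"
FALLBACK_MODEL = "@cf/meta/llama-2-7b-chat-int8"


def build_prompt(context):
    """Hypothetical prompt wording, including the "no yapping" instruction."""
    return (
        "Write one multiple-choice question about the passage below. "
        'Respond only with a JSON object with the keys "question", "answer", '
        'and "decoys" (a list of wrong options). No yapping.\n\n'
        "Passage: " + context
    )


def extract_json_object(text):
    """Parse the first JSON object in text, skipping any statement before it."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


def validate_question(obj):
    """Check the assumed schema: non-empty question and answer, plus decoys."""
    for key in ("question", "answer"):
        if not isinstance(obj.get(key), str) or not obj[key].strip():
            raise ValueError("missing or empty field: " + key)
    decoys = obj.get("decoys")
    if not isinstance(decoys, list) or not decoys:
        raise ValueError("decoys must be a non-empty list")
    if obj["answer"] in decoys:
        raise ValueError("the correct answer also appears among the decoys")
    return obj


def generate_question(context, call_model, models=(PRIMARY_MODEL, FALLBACK_MODEL)):
    """Try each model in order; return the first response that parses and validates."""
    prompt = build_prompt(context)
    last_error = None
    for model in models:
        try:
            return validate_question(extract_json_object(call_model(model, prompt)))
        except Exception as exc:  # endpoint failure or unusable output
            last_error = exc
    raise RuntimeError("all models failed to produce a valid question") from last_error
```

Treating an unparseable response the same way as an endpoint failure is a design choice in this sketch: either way the next model gets a chance before the request is reported as failed.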
Challenge
Generating a multiple-choice question can be challenging, especially when generating the decoy options. The decoy options should appear as plausible as possible to make the question more challenging. At present, the decoys sometimes seem too obvious. This could be an opportunity for research, specifically in the area of generating decoys for multiple-choice questions.
Future Work
Currently, the articles used for question generation come from a single source. There are two options I want to try: (1) adding a feature that lets users input their own article URL and generate questions from that source, or (2) scraping a random Wikipedia page and asking the LLM to summarize it and write a fully generated article. Well, let's hear it from you!