An AI intelligence system is designed to collect and consolidate information from many data sources and then process that data to answer end users' questions.
The whole system is composed of four parts:
Data collection and extraction into a central database
There are four main types of information in the world: books, news media, journal articles, and websites.
According to Google, there are close to 130 million books in the world. Selected books would be manually uploaded into the system. Who has sufficient knowledge to decide which books should be used and uploaded?
Newspapers, magazines, and journal articles can be extracted through news feed services or RSS feed readers, which allow their content to be transferred into another computer system.
According to Digital Silk, there are more than 1.1 billion websites. Who has sufficient knowledge to decide which websites should be extracted? What are the selection criteria?
Information from commercial and organizational websites can be extracted with web scraping tools. Most of the data feed services and web scraping tools are owned by Western high-technology firms.
It is a huge effort to write the numerous data extraction programs needed to pull massive amounts of data from the internet and from books and to populate the AI data model.
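To give a sense of what one such extraction program involves, here is a minimal sketch of pulling article entries out of an RSS 2.0 feed using only the Python standard library. The sample feed, the function name parse_rss_items, and the example.com links are made up for illustration. A real pipeline would download the feed from a publisher's URL and would also need error handling, scheduling, and storage.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/first</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/second</link>
      <description>Summary of the second article.</description>
    </item>
  </channel>
</rss>
""".strip()


def parse_rss_items(xml_text):
    """Return one dict per <item> with its title, link, and description."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```

Each source format (feeds, web pages, scanned books) would need its own extractor of this kind, which is where the effort adds up.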
Data selection and censorship
The raw data sources are most likely manipulated to filter out unwanted and sensitive information.
A group of professionals would have to constantly analyze the data sources and make censorship decisions.
The raw data can also be altered in favor of their interests.
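As a rough illustration of the filtering step, the sketch below drops any record that contains a term from a blocklist. The blocklist, the sample records, and the function name filter_records are all hypothetical. This is only the simplest possible form of filtering and is not a description of how any particular AI vendor selects its data.

```python
# Hypothetical blocklist chosen by human reviewers; not taken from any real system.
BLOCKED_TERMS = {"secret", "classified"}


def filter_records(records, blocked_terms=BLOCKED_TERMS):
    """Return only the records whose text contains none of the blocked terms."""
    kept = []
    for text in records:
        lowered = text.lower()
        if not any(term in lowered for term in blocked_terms):
            kept.append(text)
    return kept


# Made-up sample records for demonstration.
records = [
    "Public report on library funding.",
    "Classified memo about the budget.",
    "Weather forecast for Tuesday.",
]

print(filter_records(records))
```

Whoever maintains the blocklist decides what the system is able to see, which is the concern raised above.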
Text analytics, translation, and setting up rules and logic
The raw data sources would be translated from the original language into 184 languages in the world.
Which translator does the AI use? Is the translator reliable? If it uses Google Translate, it is certainly not reliable.
A group of subject matter experts would analyze the data, mostly text content, and set up dedicated rules and logic for specific topics.
Different AI tools provide different answers because their rules and logic are different.
AI Algorithm to deliver results to end users
A group of programmers writes AI algorithms to extract data from the central database to answer the questions.
The AI tool would answer the end users' questions based on the predefined rules and logic.
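The rule-driven lookup described in this part can be pictured with a toy sketch like the one below. Everything in it is an assumption made for illustration: the CENTRAL_DB contents, the RULES table, and the answer function are invented here and do not come from any real AI product.

```python
# Hypothetical central database: topic -> stored text.
CENTRAL_DB = {
    "books": "Stored text about books.",
    "websites": "Stored text about websites.",
}

# Hypothetical rules written by subject matter experts: keyword -> topic.
RULES = {
    "book": "books",
    "website": "websites",
}

NO_MATCH = "No rule matches this question."


def answer(question):
    """Return the stored text for the first rule whose keyword appears in the question."""
    lowered = question.lower()
    for keyword, topic in RULES.items():
        if keyword in lowered:
            return CENTRAL_DB[topic]
    return NO_MATCH


print(answer("How many books are there?"))
print(answer("What is the weather today?"))
```

Two tools with different RULES tables would return different text for the same question, which matches the point made earlier about differing rules and logic.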
Which part of the AI intelligence system consumes the most human resources? Certainly data collection and text analytics!
DeepSeek was founded in May 2023 with fewer than 200 domestic Chinese employees. The CEO of DeepSeek, Liang Wenfeng, is an engineer. With such limited resources over 2 years, they could not even manage to understand the information and subject matter categories in the world.