
Correction: AI Intelligence System Architecture

The AI Intelligence System is designed to collect and consolidate information from many data sources and then process that data to answer end users' questions.


The whole system is composed of four parts:


Data collection and extraction into a central database

  1. There are four main types of information in the world: books, news media, journal articles, and websites.

  2. According to Google, there are close to 130 million books in the world. Selected books would be manually uploaded into the system. Who has sufficient knowledge to decide which books should be used and uploaded?

  3. Newspapers, magazines, and journal articles can be extracted through news feed services or RSS feed readers, which allow the content to be transferred into another computer system.

  4. According to Digital Silk, there are more than 1.1 billion websites. Who has sufficient knowledge to decide which websites should be extracted? What are the selection criteria?

  5. Information from commercial and organizational websites can be extracted via web scraping tools. Most of the data feed services and web scraping tools are owned by Western high-technology firms.

  6. It is a huge effort to write numerous data extraction programs that pull massive amounts of data from the internet and books and populate the AI data model.
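
To give a sense of what one such extraction program involves, here is a minimal sketch in Python that parses an RSS 2.0 feed and loads its items into a central database. The sample feed, the `documents` table, and the function names are all assumptions made for illustration; a real collector would download the feed from a URL and would need one such program per source type.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 sample; a real collector would fetch this from a feed URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/a</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/b</link>
      <description>Summary of the second article.</description>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one (title, link, summary) tuple per <item> in the feed."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        rows.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return rows


def store_items(conn, rows):
    """Insert rows into the (assumed) central table, skipping links already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(title TEXT, link TEXT UNIQUE, body TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO documents (title, link, body) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the central database
    store_items(conn, parse_feed(SAMPLE_FEED))
    for title, link in conn.execute("SELECT title, link FROM documents"):
        print(title, link)
```

The `UNIQUE` constraint on `link` together with `INSERT OR IGNORE` means the same feed can be loaded repeatedly without creating duplicate rows.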


Data selection and censorship

  1. The raw data sources are most likely manipulated to filter out unwanted and sensitive information.

  2. A group of professionals should constantly analyze the data sources and make censorship decisions.

  3. The raw data can also be altered in favor of their own interests.
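
The filtering step described above can be sketched as a simple blocklist pass over the collected records. This is only an illustration under stated assumptions: the function names and the example term `topicx` are hypothetical, and a real system would likely combine human review with far more elaborate criteria.

```python
import re


def build_filter(blocked_terms):
    """Return a function that reports whether a text contains any blocked term."""
    if not blocked_terms:
        return lambda text: False
    # Whole-word, case-insensitive match on any of the blocked terms.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in blocked_terms) + r")\b",
        re.IGNORECASE,
    )
    return lambda text: pattern.search(text) is not None


def select_records(records, blocked_terms):
    """Split records into (kept, removed) according to the blocklist."""
    is_blocked = build_filter(blocked_terms)
    kept, removed = [], []
    for record in records:
        if is_blocked(record):
            removed.append(record)
        else:
            kept.append(record)
    return kept, removed


if __name__ == "__main__":
    # Hypothetical records and blocklist, for illustration only.
    records = [
        "Weather report for Monday",
        "Leaked TopicX memo",
        "Sports results",
    ]
    kept, removed = select_records(records, ["topicx"])
    print("kept:", kept)
    print("removed:", removed)
```

Whoever maintains the blocklist decides what end users can ever see, because removed records never reach the later stages.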


Data text analytics, translation, and setup of rules and logic

  1. The raw data sources would be translated from the original language into 184 languages.

  2. Which translator does the AI use? Is the translator reliable? If it is Google Translate, it is certainly not reliable.

  3. A group of subject matter experts would analyze the data, mostly text content, and set up dedicated rules and logic for specific topics.

  4. Different AI tools provide different answers because their rules and logic are different.
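
As a small illustration of the last point, the sketch below applies two rule sets to the same sentence. The rule sets, topic names, and `classify` function are hypothetical; they only show how the order and content of expert-defined rules can change the outcome.

```python
def classify(text, rules, default="uncategorized"):
    """Return the topic of the first rule whose keywords appear in the text."""
    words = set(text.lower().split())
    for keywords, topic in rules:
        if words & keywords:  # at least one keyword is present
            return topic
    return default


# Two hypothetical tools with the same rules in a different priority order.
RULES_TOOL_A = [
    ({"stock", "market"}, "finance"),
    ({"chip", "semiconductor"}, "technology"),
]
RULES_TOOL_B = [
    ({"chip", "semiconductor"}, "technology"),
    ({"stock", "market"}, "finance"),
]

if __name__ == "__main__":
    text = "chip makers lift the stock market"
    print("Tool A:", classify(text, RULES_TOOL_A))
    print("Tool B:", classify(text, RULES_TOOL_B))
```

Both tools read the same text, but because the first matching rule wins, Tool A files it under finance and Tool B under technology.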


AI algorithm to deliver results to end users

  1. A group of programmers writes the AI algorithm that extracts data from the central database to answer the questions.

  2. The AI tool would answer the end users' questions based on the predefined rules and logic.
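
The two steps above can be sketched as a lookup: a predefined rule maps the question to a topic, and the answer is pulled from the central database for that topic. Everything named here (the `documents` table, `TOPIC_KEYWORDS`, the sample rows) is an assumption for illustration, not a description of how any particular AI tool is built.

```python
import sqlite3

# Hypothetical predefined rules: topic -> keywords that trigger it.
TOPIC_KEYWORDS = {
    "finance": {"stock", "market", "bank"},
    "health": {"vaccine", "hospital"},
}


def build_database():
    """Create an in-memory stand-in for the central database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (topic TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO documents (topic, body) VALUES (?, ?)",
        [
            ("finance", "Markets closed higher on Friday."),
            ("health", "The hospital opened a new wing."),
        ],
    )
    conn.commit()
    return conn


def pick_topic(question):
    """Apply the predefined rules: return the first topic whose keywords match."""
    words = set(question.lower().replace("?", " ").split())
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return None


def answer(conn, question):
    """Answer a question using only what the rules and the database allow."""
    topic = pick_topic(question)
    if topic is None:
        return "No rule matches this question."
    row = conn.execute(
        "SELECT body FROM documents WHERE topic = ? ORDER BY rowid LIMIT 1",
        (topic,),
    ).fetchone()
    return row[0] if row else "No data stored for this topic."


if __name__ == "__main__":
    conn = build_database()
    print(answer(conn, "How did the stock market close?"))
    print(answer(conn, "What is the capital of France?"))
```

In this sketch the tool can only answer questions that some rule anticipates and that the database covers, which is why the quality of the earlier stages matters so much.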


Which part of the AI Intelligence System consumes the most human resources? Certainly the data collection and the data text analytics!


DeepSeek was founded in May 2023 with fewer than 200 domestic Chinese employees. The CEO of DeepSeek, Liang Wenfeng, is an engineer. With such limited resources over 2 years, they could not even manage to understand the information and subject matter categories in the world.








Donna Leung @ donnaunique.info
