
Issue 096

Volume 01 (2024)


Foreword

Li Peng


Corporate Senior Vice President and President of ICT Sales & Service, Huawei


In the age of AI, the core carrier of 6G services and applications will shift from the mobile Internet and smartphone apps to AI agents across various sectors. This means AI will serve as a bridge to 6G.

Progress in foundation models and AI is promoting digital transformation across industries, while laying a foundation for the future of communications technology and playing a vital role in the shift towards 6G.

In June 2023, the International Telecommunication Union (ITU) completed the recommended framework for the 6G vision, which answers the question "What is 6G?" from two aspects[1]. First, 6G will continue to evolve mobile communications and expand in three usage scenarios, i.e., enhanced Mobile Broadband (eMBB), Ultra-Reliable and Low-Latency Communication (URLLC), and massive Machine Type Communication (mMTC), to provide immersive and deterministic communication experiences and support massive connections. Second, 6G will go beyond the scope of mobile communications to achieve integrated sensing and communication, the integration of AI with communications, and ubiquitous space-air-ground integrated connectivity. These advances will allow us to "observe" the physical world in ways that exceed human limitations and create digital twins in the virtual world. The 6G vision is an embodiment of global consensus and a key milestone on the path towards a globally-unified 6G standard.

Within the three scenarios beyond communications, integrating AI with communications mainly focuses on how 6G can be designed to natively support massive AI services and applications in the future. Over the next five to ten years, 99% of all development, design, and administrative tasks are expected to be done by AI. In the near future, foundation models will even replace manual chip architecture design. This future trend will overlap with the window for 6G deployment. Technological innovation in 4G LTE brought us into the mobile Internet age, in which smartphone apps were the main carriers of applications and services. In the age of AI, the core carrier of applications and services will shift from mobile apps to AI agents.

AI agents can sense and proactively take action. Capable of sensing, learning, and acquiring knowledge, they can set action objectives based on the environment and constantly improve their capabilities. The recent success of foundation models has taken AI agent capabilities to a new level, going beyond just generative AI, to creating interactive AI capable of complex dialogues and decision-making. Therefore, in the 6G era, networks will power not only AI agents, but artificial general intelligence (AGI). Huawei's vision of Connected Intelligence (Figure 1), proposed in 2019, assumed support for native-AI capabilities and involves two aspects: AGI for 6G and 6G for AGI[2].

Figure 1: Connected Intelligence = AGI for 6G + 6G for AGI

This article covers both of these aspects, with a particular focus on 6G for AGI. As shown in Figure 2, 6G for AGI looks to explore areas like how to design communications capabilities like eMBB+, URLLC+, and mMTC+, and how to use networks' sensing capabilities to better support AI and make 6G into the neural center that connects future AI agents and a key part of AI learning, training, and inference. If 6G is to succeed in these areas, the 6G system needs to be designed with an architecture that expands beyond connectivity and must integrate the four basic functions of AI agents: sensing, cognition, decision-making, and action. This architecture should use efficient, intent-based communications to closely integrate the physical and digital worlds and thus influence how the physical world operates.

Figure 2: 6G's AI-native capabilities and the 6G for AGI framework

AGI for 6G

In the 6G era, the basic model of communication for AGI will be based on effectiveness communication[3], as proposed by Warren Weaver, or intent-based communication. Such a framework goes beyond Claude Shannon's model of communication, which involves only the transmission of bits. Bits do not represent understanding and are not intelligent, and this is essentially what distinguishes AGI-enabled 6G communication from traditional communication.

AI agent-powered 6G communications can be broken down into four types:

  1. Human-to-human system-1 and system-2 communication
  2. Machine-to-machine intent-driven communication
  3. Human-to-machine ultra-reliable low-latency communication
  4. Machine-to-human spatial-computing-based metaverse communication

To effectively support an AI-agent-powered 6G communication framework, 3GPP standard design must consider uplink channels that support both sensing and learning, as well as downlink channels that support inference, low latency, and metaverse applications. The remainder of this section will focus on the first two types of 6G communication. The other two types are similar to the first two in terms of how they use the AI-agent-based framework, but differ in terms of specific communication requirements, depending on scenarios.

AI-agent-assisted human-to-human communication

In the future, foundation-model-based communication will organically integrate communications in both the physical and digital worlds. This will give rise to a post-Shannon-model AGI communication architecture. Human-to-human communications, for example, are grounded in two core concepts:

  • First, everyone can use a foundation model, which is a generative pre-trained transformer (GPT), as an agent.
  • Second, each person's foundation model can use a GPT and spatial computing to generate a multimodal virtual real-time response, representing a proxy response of the person's deterministic behavior. In the digital world, a communication system for such GPT-based agents is known as system 1. It is worth noting that foundation models cannot accurately learn and model non-deterministic behaviors such as emotions. A communication system that transmits such information in the physical world in real time is known as system 2, which can also be designed based on GPT foundation models.

Each person can release their own foundation model, allowing people to access each other's models before communicating. This facilitates system-1 communication between foundation models, which is essentially local communication that does not use wireless communication resources. Wireless channels will be used for communication in cases where foundation models fail to generate what's necessary for system-2 communication.

6G communications between AGIs (shown in Figure 3) include internal-channel communications based on the Shannon model and external-channel communications between neural networks, between foundation models, and between agents. It should be noted that radio air interfaces are increasingly being powered by GPUs, at the cost of higher power consumption without higher performance.

Figure 3: Post-Shannon-model AGI communication architecture based on GPT models

Human-to-human communications assisted by AI agents are an advanced method of interaction that uses powerful AI capabilities to enhance and optimize the communication process. Within this framework, everyone will have two major GPT foundation models: system 1 is used for local intelligent processing, and system 2 for physical communications.

First, everyone will need to train the GPT foundation model used by their system-1 AI agent, as well as the GPT foundation model used for system-2 physical communication. Such training will primarily involve supervised offline learning. Training based on a general-purpose foundation model in a broader sense can enable continuous updates and ensure humans are kept in the loop, making the resulting GPT model more accurate and powerful. Second, an emergence detector will be required to detect whether system 1 is working properly. Parts that system 1 cannot learn or model will be distributed to system 2 for learning.

These will create an architecture that combines both fast and slow communications. System 1 provides fast communication, facilitating a local closed loop between the AI agents at both ends without occupying communication resources (shown in Figure 4). System 2 provides slow communication and can use wireless channels (shown in Figure 5). It should be noted that the use of an AGI-based intent-driven communication mechanism allows system-2 communication to reduce data traffic by 100-fold or even 1,000-fold compared with direct video communication. Furthermore, the AI agents of both system 1 and system 2 can constantly update the general-purpose foundation model.
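The fast/slow split described above can be sketched as a simple dispatcher: try the local system-1 agent first, and fall back to the system-2 wireless channel only when the emergence detector flags content the local model cannot handle. This is a minimal illustrative sketch; the function names, the vocabulary-based "emergence detector", and the use of a token count as a stand-in for air-interface traffic are all assumptions, not part of any 6G specification.

```python
# Hedged sketch of fast/slow AI-agent communication. All names and the
# toy detection rule are illustrative assumptions, not a 6G standard.

def emergence_detected(message: str, learned_topics: set[str]) -> bool:
    """Toy emergence detector: flags content system 1 cannot model
    (here, any word outside the locally learned vocabulary)."""
    return any(word not in learned_topics for word in message.split())

def system1_reply(message: str) -> tuple[str, int]:
    """Fast path: the local GPT agent answers; zero traffic over the air."""
    return f"[local proxy reply to: {message}]", 0

def system2_reply(message: str) -> tuple[str, int]:
    """Slow path: transmit intent tokens over the wireless channel.
    The token count stands in for air-interface traffic."""
    tokens = message.split()
    return f"[remote reply to {len(tokens)} tokens]", len(tokens)

def communicate(message: str, learned_topics: set[str]) -> tuple[str, int]:
    """Route each message: system 1 when it suffices, else system 2."""
    if emergence_detected(message, learned_topics):
        return system2_reply(message)
    return system1_reply(message)

known = {"meeting", "tomorrow", "at", "nine"}
print(communicate("meeting tomorrow at nine", known))  # fast path, 0 traffic
print(communicate("surprise party details", known))    # slow path
```

In a real system the detector would be a learned model rather than a vocabulary check, but the control flow (local closed loop by default, wireless channel only on emergence) is the point of the sketch.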

Figure 4: System 1 based on a GPT foundation model – Fast local communication

Figure 5: System 2 based on a GPT foundation model – Slow physical communication

AI-agent-assisted machine-to-machine communication

In terms of machine-to-machine communication, uploading visual-sensing results (e.g., complete videos and point clouds) to support foundation-model computing on the edge or cloud will result in a huge amount of uplink traffic, limiting the number of machines that can be supported. To combat this, a primary AI agent can be used on devices for the purpose of real-time token alignment with foundation models on the cloud through wireless channels to facilitate AI agent collaboration across devices, pipes, and cloud, thus realizing massive machine-to-machine communications (shown in Figure 6).

Figure 6: 6G-powered, intent-driven machine-to-machine communication

Specifically, an AI-agent-based post-Shannon-model communication framework uses intent-driven communication according to the following steps:

  • Step 1: Use an AI agent on devices to perform primary preprocessing and analysis of the scenario, which is also known as goal-oriented filtering, in order to clean sensing data in real time.
  • Step 2: Perform embedding in the transformer foundation model on the extracted objects to obtain the simplified mathematical descriptions (tokens) of intents.
  • Step 3: Transfer data back to the edge or cloud through wireless channels to align intents (represented by tokens) on both ends in real time, thereby realizing efficient machine-to-machine communication with device-pipe-cloud synergy.
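The three steps above can be sketched as a small pipeline. Everything here is a hedged toy model: the frame structure, the byte sizes, the filtering rule, and the hash-based "embedding" are placeholders for an on-device AI agent and a real transformer, chosen only to make the filter-embed-transmit flow and the resulting traffic reduction concrete.

```python
# Illustrative sketch of the three-step intent-driven pipeline.
# Frame sizes, the filtering rule, and the token format are assumptions.

RAW_FRAME_BYTES = 1_000_000   # one uncompressed sensing frame (assumed)
TOKEN_BYTES = 4               # wire size of one intent token (assumed)

def goal_oriented_filter(frame: dict, intent: str) -> list[str]:
    """Step 1: the on-device agent keeps only objects relevant to the intent."""
    return [obj for obj in frame["objects"] if intent in frame["labels"][obj]]

def embed_to_tokens(objects: list[str]) -> list[int]:
    """Step 2: stand-in for transformer embedding; maps each extracted
    object to a compact token id."""
    return [hash(obj) % 65536 for obj in objects]

def transmit(tokens: list[int]) -> int:
    """Step 3: send tokens uplink for intent alignment; returns bytes sent."""
    return len(tokens) * TOKEN_BYTES

frame = {
    "objects": ["car_17", "tree_3", "ped_5"],
    "labels": {"car_17": {"vehicle"}, "tree_3": {"static"},
               "ped_5": {"vehicle", "pedestrian"}},
}
sent = transmit(embed_to_tokens(goal_oriented_filter(frame, "vehicle")))
print(f"uplink: {sent} bytes vs {RAW_FRAME_BYTES} bytes raw "
      f"({RAW_FRAME_BYTES // sent}x reduction)")
```

Because only a handful of tokens leave the device instead of the raw video or point cloud, the uplink cost scales with the number of intent-relevant objects rather than with the sensing resolution.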

6G is essentially about integrating communication, AI, and sensing to create a neural center for numerous AI agents.

Compared with direct video transmission, this transmission mechanism can reduce data traffic by 100-fold or even 1,000-fold, increasing the number of communicating users the system can support by an order of magnitude.

6G for AGI

6G sensing provides big data sources for AI learning

6G's integrated sensing and communication (ISAC) is another unique advantage for applying AI agent services on 6G networks. ISAC brings new opportunities to wireless communication systems, providing wireless sensing services while supporting communications. Native convergence of sensing and communication enables mobile base stations and devices to obtain a larger sensing scope and higher sensing accuracy through collaborative sensing, without additional spectrum or increased equipment costs. With shorter radio wavelengths, broader spectrum resources, and larger antenna apertures, 6G can support the highly accurate, real-time reproduction of the physical environment as a service. This capability can also help significantly reduce transmission power consumption while enhancing wireless transmission performance.

Data extracted from 6G sensing can be used for modeling the physical world in areas the network can reach, as well as providing a source of big data for AI learning (shown in Figure 7). People, machines, vehicles, buildings, materials, and even weather can be objects of 6G sensing. Wireless sensing can provide big data on the environment through parameter estimation, imaging, and even mass spectrometry, all of which are transmitted over radio waves. Attention and study are both essential for sensing across the entire communications spectrum, including centimeter wave, millimeter wave, and sub-THz bands. THz technology has the potential to see wide adoption in high-precision sensing.

Figure 7: 6G sensing is a major data source for future AI foundation models

6G-based intelligent and inclusive A-RAN and A-Core

Foundation models for natural language processing, represented by ChatGPT, will match and exceed human capabilities in the near future. However, such human-like intelligence will be possible only when supported by the computing power provided by supercomputing clusters. For example, 500 billion neural network parameters require over 10 million watts of power supply to run. In the next 10 years, it is unlikely that such foundation models will be able to run on mobile devices.

6G networks must be able to deliver inclusive intelligent services for all people and all things, anytime and anywhere. This requires 6G networks to adopt a C+A+S (communication + AI + sensing) architecture powered by foundation models. The convergence of communication, AI, and sensing is a key feature of 6G. The post-Shannon-model communication architecture that supports AGI (i.e., the C+A+S radio access network) is called A-RAN (shown in Figure 8).

Figure 8: A-RAN architecture: Communication-AI-sensing converged access network

6G networks can also be built with AI agents, with each serving as a logical network element (NE). Huawei's 2012 Laboratories proposed the concept of application-driven networks in 2015. The core idea behind this is automatically generating customized networks for customers' application requirements, and cancelling such networks when the applications are no longer needed. The infrastructure is like a unified computing platform that stretches across the entire network. It is a task-based network architecture and the prototype of 6G A-Core (as shown in Figure 9).

Figure 9: A-Core: Task-based network architecture

Creating a real-time twin world with 6G AGI

In the digital world, virtual copies of physical entities are built based on application intent in order to simulate and analyze real-world behavior and performance, in what we call digital twins. This section covers two technologies related to digital twins. The first is building digital twins that can mirror the physical world in real time through intent-driven communication. The second is performing accurate spatiotemporal inference about the physical world based on the real-time twin world. Both technologies have the potential for wide application across numerous scenarios like self-driving vehicles, robotics, smart industrial production, and telemedicine.

6G devices can be placed in a small physical space, such as a small room, to support integrated sensing and communication and to provide the AI computing power required to run a small-scale AI model. These devices can sense the physical world through wireless signals. By processing the sensing signals, the devices can generate a point cloud that both depicts and describes the physical world.

Then, with a given intent, the physical world is selectively reproduced through AGI communications between an AI foundation model in the network and the small AI model running on the 6G devices. The twin world in question does not have to completely describe the physical world, but is capable of identifying things relevant to the intent. This reduces communication overhead, while protecting personal privacy and keeping information about the location confidential.

The sensing system senses point-cloud information, which consists of lattice representations of objects in three-dimensional space. To reduce the amount of data to be transmitted and protect user privacy, we upload semantic spatiotemporal information to the cloud for fusion only when the semantic target matches the original sensing information. To this end, we have developed a unique semantic spatiotemporal fusion and prediction algorithm that fuses semantic information to form a digital twin. The algorithm enables a digital twin to mirror the information of concern in almost real time, while reducing the required uplink transmission bandwidth by orders of magnitude.
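The match-gated uploading described here can be sketched as a stream filter: each sensed frame carries semantic labels, and only frames whose labels intersect the targets of interest contribute uplink traffic. This is a minimal sketch under assumed data shapes; the matching rule is a deliberate simplification of the semantic spatiotemporal fusion and prediction algorithm, whose details are not given in the article.

```python
# Minimal sketch of match-gated uploading, assuming frames arrive as
# (timestamp, semantic_labels, summary_bytes) tuples. The set-intersection
# match is a placeholder for the real semantic matching step.

def semantic_match(frame_labels: set[str], targets: set[str]) -> bool:
    """Upload only when the frame contains a semantic target of interest."""
    return bool(frame_labels & targets)

def process_stream(frames, targets):
    """Scan a stream of frames, upload only matching semantic summaries,
    and tally the resulting uplink bytes."""
    uploaded, uplink = [], 0
    for timestamp, labels, summary_bytes in frames:
        if semantic_match(labels, targets):
            uploaded.append(timestamp)
            uplink += summary_bytes
    return uploaded, uplink

frames = [
    (0, {"wall", "chair"}, 120),
    (1, {"person", "book"}, 150),  # matches a "reading a book" intent
    (2, {"wall"}, 110),
]
print(process_stream(frames, {"book", "person"}))
```

Non-matching frames never leave the device, which is what yields both the bandwidth reduction and the privacy property: the cloud only ever sees semantic summaries tied to a declared intent.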

Object positioning and human-pose tracking can already be driven by natural language through a real-time twin world. Unlike ChatGPT, such a system can sense the physical world in real time, perform semantic, temporal, and spatial inference, and then present the results in the form of natural language. For example, the positions of an object and person can be displayed in the digital twin in real time in the form of a rendered animated object and figure. A 6G AGI system can use the sensed point cloud information to identify actions relevant to preset intents (e.g., reading a book), and locate the spatial positions of such occurrences in real time. Continuous improvements to the sensing system will further increase inference accuracy.

In conclusion, we believe that 6G is essentially about integrating communication, AI, and sensing to create a neural center for numerous AI agents. We also believe that the following elements of 6G native-AGI communication will be essential in the future foundation model era:

  • Connected Intelligence = AGI for 6G + 6G for AGI
    • AGI for 6G: Effectiveness communication powered by the post-Shannon-model communication architecture
    • 6G for AGI: An inclusive intelligent neural center that integrates AI learning, training, and inference
  • 6G A-RAN which integrates communication, sensing, and AI
  • Task-based 6G A-Core that is built with agent NEs

References

  1. Recommendation ITU-R M.2160-0, Framework and overall objectives of the future development of IMT for 2030 and beyond.
  2. Wen Tong and Peiying Zhu, "6G: The Next Horizon: From Connected People and Things to Connected Intelligence," Cambridge University Press, May 2021.
  3. C. E. Shannon and W. Weaver, The Mathematical Theory of Communication. The University of Illinois Press, 1949.