2478 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 43, NO. 7, JULY 2025
Hybrid Digital-Analog Semantic Communications
Huiqiang Xie, Member, IEEE, Zhijin Qin, Senior Member, IEEE, Zhu Han, Fellow, IEEE, and Khaled B. Letaief, Fellow, IEEE
Abstract—Digital and analog semantic communications (SemCom) face inherent limitations, such as data security concerns in analog SemCom and leveling-off and cliff-edge effects in digital SemCom. To overcome these challenges, we propose a novel SemCom framework and a corresponding system called HDA-DeepSC, which leverages a hybrid digital-analog approach for multimedia transmission. This is achieved through the introduction of analog-digital allocation and fusion modules. To strike a balance between data rate and distortion, we design new loss functions that take into account long-distance dependencies in the semantic distortion constraint, essential information recovery in the channel distortion constraint, and optimal bit stream generation in the rate constraint. Additionally, we propose denoising diffusion-based signal detection techniques, which involve carefully designed variance schedules and sampling algorithms to refine transmitted signals. Through extensive numerical experiments, we demonstrate that HDA-DeepSC exhibits robustness to channel variations and is capable of supporting various communication scenarios. Our proposed framework outperforms existing benchmarks in terms of peak signal-to-noise ratio and multi-scale structural similarity, showcasing its superiority in semantic communication quality.
Index Terms—Semantic communications, multimedia transmission, analog communications, digital communications, hybrid digital-analog communications.
Received 15 May 2024; revised 16 December 2024; accepted 15 January 2025. Date of publication 10 April 2025; date of current version 19 June 2025. This work was supported in part by the National Key Research and Development Program of China under Grant 2023YFB2904300; in part by the National Natural Science Foundation of China (NSFC) under Grant 62401227 and Grant 62293484; in part by Guangzhou Municipal Science and Technology Project under Grant 2025A04J3380; in part by Fundamental Research Funds for the Central Universities under Grant 21624349; in part by the Hong Kong Research Grants Council under the Areas of Excellence Scheme under Grant AoE/E-601/22-R; in part by NSF ECCS-2302469, Toyota; in part by Amazon; and in part by the Japan Science and Technology Agency (JST) Adopting Sustainable Partnerships for Innovative Research Ecosystem (ASPIRE) under Grant JPMJAP2326. An earlier version of this paper was presented in part at the IEEE Globecom Workshop 2024 [1]. (Corresponding author: Zhijin Qin.)
Huiqiang Xie is with the College of Information Science and Technology, Jinan University, Guangzhou 510632, China (e-mail: huiqiangxie@jnu.edu.cn).
Zhijin Qin is with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China, also with the State Key Laboratory of Space Network and Communications, Beijing 100084, China, and also with Beijing National Research Center for Information Science and Technology, Beijing 100084, China (e-mail: qinzhijin@tsinghua.edu.cn).
Zhu Han is with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX 77004 USA, and also with the Department of Computer Science and Engineering, Kyung Hee University, Seoul 446-701, South Korea (e-mail: hanzhu22@gmail.com).
Khaled B. Letaief is with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China (e-mail: eekhaled@ust.hk).
Digital Object Identifier 10.1109/JSAC.2025.3559149
© 2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
I. INTRODUCTION
AS MOBILE communication systems transition from the fifth generation (5G) to the sixth generation (6G), there is a need to address the evolving requirements of seamlessly integrating virtual/augmented reality, remote control robots, and ubiquitous connected intelligence. To meet these demands, target key performance indicators [2] have been proposed, aiming to ensure the seamless integration of these advanced technologies in the next generation of mobile communication systems, e.g., $10^7$ devices/km$^2$ for connectivity, 60 b/s/Hz for spectral efficiency, and 100 $\mu$s for end-to-end latency. To materialize this vision, semantic communications [3] have been envisioned as one of the potential technologies due to their low semantic errors, high spectral efficiency, and high transmission rates. By exchanging semantic information at both ends, semantic communications can reconstruct sources or directly perform tasks while tolerating transmission errors. According to the communication paradigm, semantic communications can be divided into two categories: analog semantic communications and digital semantic communications.
Analog semantic communications [4], [5], [6], [7], [8], [9], [10], [11], [12] convey the semantic information using continuous signals. They take advantage of deep learning (DL) to design end-to-end systems and map the source directly to non-fixed-size constellations. Many works exist for the transmission of different data modalities. Xie et al. [4] have developed a DL-based semantic communication system, named DeepSC, for text transmission, in which the sentences are mapped to embedding vectors and then transformed to learned non-fixed-size constellation points. Yi et al. [5] introduced an explicit knowledge base to DeepSC as side information and integrated the knowledge base into the end-to-end optimization, achieving a higher bilingual evaluation understudy (BLEU) score in the low signal-to-noise ratio (SNR) regions. Weng et al. [6] have proposed an end-to-end semantic communication system for speech recognition and speech synthesis tasks, named DeepSC-ST, which processes the speech signals and outputs continuous constellation points at the transmitter. Grassucci et al. [7] have designed a generative audio semantic communication framework, which transmits continuous embedding vectors to generate the audio at the receiver. Dai et al. [9] have investigated the end-to-end image transmission problem, in which the image is non-linearly transformed into continuous signals with different lengths. Wu et al. [11] have investigated end-to-end image transmission for multiple-input multiple-output (MIMO) channels. Similarly, the images are converted into continuous semantic features and adaptively assigned to different subchannels based on the channel state information (CSI). Wang et al. [12] have proposed a video semantic communication system, in which the semantic features of frames are extracted into continuous signals and transmitted using analog communication methods.
The continuous signals in analog semantic communications have two benefits. One is that they allow gradient propagation and enable end-to-end optimization. The other is that the continuous signals have a high degree of freedom, which lets performance vary smoothly with channel conditions and gives better robustness in the low SNR regimes. However, continuous signals also have flaws. Commercial encryption algorithms are designed for discrete signals, e.g., bit streams, raising concerns about the data security of continuous signal-based systems. Besides, in some scenarios that require accurate transmission at the bit level, analog semantic communications cannot meet the requirement because the candidate set of a continuous signal is approximately infinite. Therefore, digital semantic communications have attracted the attention of researchers.
Digital semantic communications [13], [14], [15], [16], [17], [18], [19], [20], [21] transmit semantic information in the form of discrete signals, mapping the source to bit streams or fixed-size constellations. Tung et al. [13] have proposed quantized joint source-channel coding for image transmission, named DeepJSCC-Q, which maps the continuous signals to the closest points in fixed-size constellations to be compatible with some protocols. Similarly, Bo et al. [14] improved quantized joint source-channel coding by learning the transition probability from source data to discrete constellation symbols, in which Gumbel-Max sampling is employed to sample the constellation points from the learned transition probability, thereby avoiding the non-differentiable quantization. Guo et al. [16] quantized the semantic information with a learnable non-linear scalar quantizer, which learns to adopt dynamic quantization levels for different semantic values. Fu et al. [18] have proposed a vector quantized semantic communication system, in which the semantic vectors are quantized into bit streams with a learnable vector quantizer and transmitted with digital channel coding and modulation. Gao et al. [20] have developed an adaptive modulation and retransmission scheme by deriving the relationship between the bit-error rate and the task performance, in which the semantic information is quantized into fixed-length bit streams. Huang et al. [21] have proposed an iterative training algorithm for digital semantic communications, in which the deep source codec is trained according to the chosen channel coding rate.
The above works on digital semantic communication achieve accurate transmission at the bit or symbol level, and some of them can apply encryption algorithms to the bit streams. However, digital semantic communication systems introduce unavoidable quantization errors when quantizing continuous signals to discrete signals, which causes the leveling-off effect. That is, the quality of the decoded source signal is limited by the quantization errors. Besides, digital semantic communications experience the cliff-edge effect as channel conditions vary, which usually results in a drastic degradation in performance at lower SNRs. Therefore, it is imperative to adopt a new semantic communication paradigm that can address the limitations of both analog and digital semantic communications. This paradigm should enhance data security and mitigate the leveling-off and cliff-edge effects. However, designing such a semantic communication system poses several challenges that need to be overcome, namely,
Q1: How to enhance data security and alleviate the leveling-off and cliff-edge effects?
Q2: How can it be compatible with purely analog and purely digital semantic communication?
Q3: How to support various communication environments, e.g., the wide bandwidth scenario or the weak communication scenario?
The concept of hybrid digital-analog (HDA) joint source-channel codes [22] was proposed in 2002 by Mittal et al., who proved that HDA codes are theoretically capable of achieving the Shannon limit (theoretically optimum distortion) with less severe leveling-off and cliff-edge effects. Since then, HDA codes have attracted much attention from academia and industry [23], [24], [25], [26], [27]. Skoglund et al. [24] have proposed HDA codes for bandwidth compression scenarios, and Köken et al. [26] have analyzed the robustness of HDA codes with bandwidth mismatch. HDA transmission is also adopted in Japanese and Canadian television signal transmission [28], where video and speech signals are transmitted by analog and digital transceivers, respectively. Yu et al. [29] have designed HDA joint source-channel coding for scalable video transmission, named WSVC, which uses the 2D discrete wavelet transform for analog transmission and H.264/AVC for digital transmission. Lan et al. [30] first formulated the video transmission distortions and then proposed a sub-optimal resource allocation scheme, which allocates the power and quantization bits. Tan et al. [31] have proposed the optimal resource allocation for the Internet-of-things (IoT) scenario. Three factors are optimized to enhance the quality of the recovered image: the digital bandwidth, the orthogonal power, and the nonorthogonal power of the analog signal. Yahampath [32] has considered imperfect channel state information (CSI) for video transmission, in which the digital power is allocated by considering the CSI errors, and the remaining power is used to transmit superimposed analog QAM symbols. However, these works rely on linear transforms and ignore the semantic information behind the data, which makes them unsuitable for non-linear semantic transmission.
Inspired by the concept of HDA codes, we propose a novel framework called DL-based HDA semantic communication. This framework integrates the strengths of both analog and digital semantic communications to effectively tackle the challenges mentioned earlier. Firstly, HDA semantic communication systems can improve data security and alleviate the leveling-off and cliff-edge effects by transmitting part of the information with the continuous signals of analog communications (Q1). Besides, analog and digital semantic communications are special cases of HDA semantic communications. By controlling the ratio between analog and digital components, HDA semantic communications not only can be transformed into purely analog or purely digital semantic communications (Q2) but also support different communication scenarios (Q3). The main contributions are summarized as follows:
• A novel HDA semantic communication framework is proposed, which takes advantage of analog and digital semantic communications and addresses the limitations inherent in each.
• Based on the HDA semantic communication framework, we propose an HDA semantic communication system,
named HDA-DeepSC, for multimedia transmission, in which new analog-digital allocation and fusion modules are proposed to generate the analog and digital components. Besides, new loss functions are designed to capture the local and global information, alleviate the distortions from channels, and balance the source rate.
• To further improve the quality of the recovered images, we propose a diffusion-based framework that enhances signal detection through a designed variance schedule and sampling algorithm.
• Based on extensive simulation results, the proposed HDA-DeepSC outperforms the conventional and DL-based communication systems and improves the system robustness in the low SNR regime.
The rest of this paper is organized as follows. The system model is introduced in Section II. The HDA semantic transmission is proposed in Section III. Section IV details the proposed diffusion-based signal detection. Numerical results are presented in Section V to show the performance of the proposed frameworks. Finally, Section VI concludes this paper.
Notation: Bold-font variables denote matrices or vectors. $\mathbb{C}^{n \times m}$ and $\mathbb{R}^{n \times m}$ represent complex and real matrices of size $n \times m$, respectively. $\mathcal{CN}(\mu, \sigma^2)$ means circularly-symmetric complex Gaussian distribution with mean $\mu$ and covariance $\sigma^2$. $\mathcal{N}(\mu, \sigma^2)$ means Gaussian distribution with mean $\mu$ and covariance $\sigma^2$. $\mathcal{U}(a, b)$ means continuous uniform distribution between $a$ and $b$. $(\cdot)^{*}$ denotes the conjugate operation. $\mathbf{x}[k]$ represents the $k$-th element in the vector.
II. SYSTEM MODEL
As shown in Fig. 1, we consider a single-input single-output (SISO) communication system, which aims to send multimedia over the air. The proposed HDA SemCom framework consists of the HDA transmitter, the wireless channel model, and the HDA receiver, and employs both digital semantic transmission and analog semantic transmission.
Fig. 1. The proposed hybrid digital-analog semantic communication framework.
A. The Hybrid Digital-Analog Transmitter
The HDA transmitter consists of a semantic encoder that extracts the semantic information behind images, analog-digital allocation that allocates the semantic information for analog and digital transmission, and channel encoders that protect the information over the air.
Consider an image $\mathbf{I} \in \mathbb{R}^{3 \times H \times W}$, where $H$ and $W$ are the height and width of the image. The semantic information can be extracted by
$$\mathbf{z} = \mathcal{S}(\mathbf{I}; \alpha_t), \tag{1}$$
where $\mathbf{z} \in \mathbb{R}^{M \times 1}$ is the semantic information and $\mathcal{S}(\cdot; \alpha_t)$ denotes the semantic encoder with parameters $\alpha_t$. Then, $\mathbf{z}$ is split into two parts by the analog-digital allocation module:
$$[\mathbf{z}_A, \mathbf{z}_D] = \mathcal{A}(\mathbf{z}; \theta_t), \tag{2}$$
where $\mathbf{z}_A$ and $\mathbf{z}_D$ are the semantic information transmitted by the analog transmitter and the digital transmitter, respectively, and $\mathcal{A}(\cdot; \theta_t)$ is the analog-digital allocation with parameters $\theta_t$.
1) Analog Transmitter: The encoded symbols for analog semantic transmission are represented as
$$\mathbf{x}_A = \mathcal{C}_A(\mathbf{z}_A; \beta_t), \tag{3}$$
where $\mathbf{x}_A \in \mathbb{C}^{L_A \times 1}$ denotes the encoded complex symbols and $\mathcal{C}_A(\cdot; \beta_t)$ denotes the analog channel encoder with parameters $\beta_t$.
2) Digital Transmitter: The quantizer and entropy coding are first employed to convert $\mathbf{z}_D$ into bit streams by
$$\mathbf{b} = \mathcal{E}(\mathcal{Q}(\mathbf{z}_D)), \tag{4}$$
where $\mathbf{b}$ denotes the bit streams, and $\mathcal{Q}(\cdot)$ and $\mathcal{E}(\cdot)$ denote the quantizer and entropy encoder, respectively. Then, $\mathbf{b}$ is encoded with digital channel encoders (e.g., LDPC codes) and fixed-size constellations (e.g., 16-QAM) by
$$\mathbf{x}_D = \mathcal{M}(\mathcal{C}_D(\mathbf{b})), \tag{5}$$
where $\mathbf{x}_D \in \mathbb{C}^{L_D \times 1}$ denotes the encoded symbols, $\mathcal{M}(\cdot)$ represents the fixed-size modulation, and $\mathcal{C}_D(\cdot)$ denotes the digital channel encoder.
With the analog and digital symbols, the transmitted symbols are $\mathbf{x} = [\mathbf{x}_A, \mathbf{x}_D] \in \mathbb{C}^{L \times 1}$, where $L = L_A + L_D$. The bandwidth compression ratio is defined as $\eta = \frac{L}{3 \times H \times W}$.
B. Wireless Channel Model
When $\mathbf{x}$ is transmitted over the block fading channels, the received signal can be given by
$$\mathbf{y} = h\mathbf{x} + \mathbf{n}, \tag{6}$$
where $h$ is the channel coefficient that remains constant within a channel coherence time, and $\mathbf{n}$ is the additive white Gaussian noise (AWGN), in which $\mathbf{n} \sim \mathcal{CN}(0, \sigma_n^2 \mathbf{I}_L)$. For the Rayleigh
fading channel, the channel coefficient follows $h \sim \mathcal{CN}(0, 1)$; for the Rician fading channel, it follows $h \sim \mathcal{CN}(\mu_h, \sigma_h^2)$ with $\mu_h = \sqrt{r/(r+1)}$ and $\sigma_h = \sqrt{1/(r+1)}$, where $r$ is the Rician coefficient. The SNR is defined as $\mathbb{E}(\|\mathbf{x}\|^2)/\mathbb{E}(\|\mathbf{n}\|^2)$.
C. The Hybrid Digital-Analog Receiver
The receiver comprises signal detection that estimates the transmitted symbols, an analog-digital fusion module that fuses the digital and analog semantic information, channel decoders that alleviate the distortions from the wireless channels, and a semantic decoder that recovers the images with the received semantic information.
With least squares (LS) signal detection, the transmitted symbols can be estimated by
$$\hat{\mathbf{x}} = \frac{h^{*}}{|h|^2}\mathbf{y} = \mathbf{x} + \frac{h^{*}}{|h|^2}\mathbf{n}, \tag{7}$$
where $\hat{\mathbf{x}} = [\hat{\mathbf{x}}_A, \hat{\mathbf{x}}_D]$ represents the estimated symbols. We assume that $h$ is the perfect CSI. After the signal detection, the semantic features are recovered by the analog and digital receivers, respectively.
1) Analog Receiver: The semantic features transmitted by analog communications are estimated by
$$\hat{\mathbf{z}}_A = \mathcal{C}_A^{-1}(\hat{\mathbf{x}}_A; \beta_r), \tag{8}$$
where $\hat{\mathbf{z}}_A$ denotes the estimated semantic features and $\mathcal{C}_A^{-1}(\cdot; \beta_r)$ denotes the analog channel decoder with parameters $\beta_r$.
2) Digital Receiver: For digital semantic transmission, the transmitted bit streams are first recovered by
$$\hat{\mathbf{b}} = \mathcal{C}_D^{-1}\left(\mathcal{M}^{-1}(\hat{\mathbf{x}}_D)\right), \tag{9}$$
where $\mathcal{C}_D^{-1}(\cdot)$ represents the digital channel decoder and $\mathcal{M}^{-1}(\cdot)$ denotes the fixed-size demodulation. Then, the semantic features transmitted with digital semantic transmission are recovered by
$$\hat{\mathbf{z}}_D = \mathcal{Q}^{-1}(\mathcal{E}^{-1}(\hat{\mathbf{b}})), \tag{10}$$
where $\mathcal{E}^{-1}(\cdot)$ and $\mathcal{Q}^{-1}(\cdot)$ denote the entropy decoder and dequantizer, respectively.
With $\hat{\mathbf{z}}_A$ and $\hat{\mathbf{z}}_D$, the semantic features are fused by
$$\hat{\mathbf{z}} = \mathcal{A}^{-1}(\hat{\mathbf{z}}_A, \hat{\mathbf{z}}_D; \theta_r), \tag{11}$$
where $\hat{\mathbf{z}}$ is the recovered semantic information and $\mathcal{A}^{-1}(\cdot; \theta_r)$ represents the analog-digital fusion module with parameters $\theta_r$.
Finally, the transmitted image can be reconstructed by
$$\hat{\mathbf{I}} = \mathcal{S}^{-1}(\hat{\mathbf{z}}; \alpha_r), \tag{12}$$
where $\mathcal{S}^{-1}(\cdot; \alpha_r)$ represents the semantic decoder with parameters $\alpha_r$.
III. HYBRID DIGITAL-ANALOG SEMANTIC TRANSMISSION
In this section, we design an HDA semantic communication system, named HDA-DeepSC, for heterogeneous wireless communication environments. Then, we develop the new loss function to train the HDA-DeepSC with the proposed training algorithm.
A. Model Design
The proposed HDA-DeepSC is shown in Fig. 2. The design of each module is detailed below.
1) Semantic Codec: The semantic encoder comprises a convolutional layer and a residual Swin Transformer block. The first convolutional layer projects the images into vector-shaped tokens, which are used as inputs to the residual Swin Transformer block in a permutation-invariant manner. The residual Swin Transformer block consists of several Swin Transformer layers and a convolutional layer, in which the Swin Transformer layer [33] originates from the Transformer and introduces the local attention and shifted window mechanism to improve the visual semantic understanding. Besides, a convolutional layer with spatially invariant filters in the residual block can enhance the translational equivariance. The residual connection allows for aggregation of the shallow and deep semantic features.
Similarly, the semantic decoder consists of the residual Swin Transformer block, convolutional layers, and pixel shuffle. The residual block enhances the visual semantic understanding. The residual connection provides a short connection from the semantic encoder to the semantic decoder, allowing the reconstruction process to fuse varying levels of features. The convolutional layers and pixel shuffle form the reconstruction module, in which the sub-pixel convolutional layer upsamples the feature and pixel shuffle reallocates the features to reconstruct the transmitted images.
2) Analog-Digital Allocation and Fusion: At the transmitter, the analog-digital allocation module transforms the original semantic information into essential and auxiliary semantic information. The essential semantic information plays an important role in building the images, and the other parts of the semantic information work to improve the quality of the image. The essential part includes the basic information about the image, e.g., the low-frequency information, and needs to be delivered accurately and cryptographically. If the essential part cannot be obtained, the image cannot be built. However, analog semantic transmission is by nature based on continuous signals and is not compatible with discrete encryption algorithms. Therefore, the essential part is transmitted accurately by digital communication systems, in which data encryption methods (e.g., symmetric cryptography and asymmetric cryptography) can be applied to encrypt the bit streams to guarantee the data security of the essential part. A hyper codec is proposed to extract the essential part of the original semantic information, which is given by
$$\mathbf{z}_D = \mathcal{H}(\mathbf{z}; \theta_t), \tag{13}$$
where $\mathcal{H}(\mathbf{z}; \theta_t)$ denotes the hyper encoder. As shown in Fig. 2, the hyper encoder employs two convolutional layers to downsample the original semantic information, which enables a larger receptive field and extracts the essential semantic information.
The auxiliary part helps improve the quality of the recovered image and is transmitted by analog communication systems, with the following benefits. Analog communication systems do not have a cliff effect and are suitable for optimizing systems in an end-to-end manner. To extract the auxiliary part, we first analyze the entropy of $\mathbf{z}$ conditioned on $\tilde{\mathbf{z}}$, $H(\mathbf{z} \mid \tilde{\mathbf{z}})$, which
quantifies the uncertainty about $\mathbf{z}$ when $\tilde{\mathbf{z}}$ is known. In other words, it measures the remaining information of $\mathbf{z}$ when $\tilde{\mathbf{z}}$ is known. The lower bound of $H(\mathbf{z} \mid \tilde{\mathbf{z}})$ is derived by
$$H(\mathbf{z} \mid \tilde{\mathbf{z}}) = H(\mathbf{z}, \tilde{\mathbf{z}}) - H(\tilde{\mathbf{z}}) \geq H(\mathbf{z}) - H(\tilde{\mathbf{z}}), \tag{14}$$
where equality holds when $\tilde{\mathbf{z}}$ is close to $\mathbf{z}$. Here, $\tilde{\mathbf{z}} = \mathcal{H}^{-1}\left(\mathcal{Q}^{-1}(\mathcal{Q}(\mathbf{z}_D)); \theta_r\right)$ is the semantic information recovered from the essential part without consideration of transmission errors, and $\mathcal{H}^{-1}(\cdot; \theta_r)$ denotes the hyper decoder, in which two convolutional layers are employed to upsample and recover the basic semantic information.
Fig. 2. The structure of the proposed hybrid digital-analog semantic communication system.
By observing (14), we can obtain the remaining information of $\mathbf{z}$ when $\tilde{\mathbf{z}}$ is known, i.e., the auxiliary part, by
$$\mathbf{z}_A = \mathbf{z} - \tilde{\mathbf{z}}, \tag{15}$$
where $\mathbf{z}_A$ is transmitted by analog communications. The derivation is in Appendix A.
At the receiver, the analog-digital fusion module is employed to obtain the fine semantic information by fusing the essential and auxiliary parts, which is given by
$$\hat{\mathbf{z}} = \mathcal{A}^{-1}(\hat{\mathbf{z}}_A, \hat{\mathbf{z}}_D; \theta_r) = \mathcal{H}^{-1}(\hat{\mathbf{z}}_D; \theta_r) + \hat{\mathbf{z}}_A, \tag{16}$$
where $\mathcal{H}^{-1}(\cdot; \theta_r)$ shares the same weights with the hyper decoder in the transmitter.
The design of analog-digital allocation and fusion can also be viewed as coarse-to-fine processing. The digital and analog components transmit the coarse and auxiliary semantic information, i.e., the basics and the supplements of the image, respectively. The receiver fuses the coarse and auxiliary semantic information to obtain fine semantic information, which is used to recover the high-fidelity images.
3) Digital Transceiver: The quantizer module rounds the elements of $\mathbf{z}_D$ to the nearest integer, giving $\tilde{\mathbf{z}}_D$. Then, arithmetic coding, which is one kind of entropy coding, converts $\tilde{\mathbf{z}}_D$ into bit streams. The entropy coding requires the distribution of $\tilde{\mathbf{z}}_D$ in advance. Similarly to [34], we model $\tilde{\mathbf{z}}_D$ using a non-parametric, fully factorized density model by
$$p(\tilde{\mathbf{z}}_D \mid \psi) = \prod_i \left( p_{\tilde{\mathbf{z}}_D[i] \mid \psi[i]}(\psi[i]) * \mathcal{U}\left(-\tfrac{1}{2}, \tfrac{1}{2}\right) \right)(\tilde{\mathbf{z}}_D[i]), \tag{17}$$
where $\psi[i]$ denotes the parameters of each univariate distribution $p_{\tilde{\mathbf{z}}_D[i] \mid \psi[i]}$ and $*$ denotes convolution. As in most cases, we model the quantization errors with the uniform distribution. Therefore, we convolve each non-parametric density with a standard uniform density to better match the prior of $\tilde{\mathbf{z}}_D$.
For digital channel codec and modulation, we adopt the adaptive modulation and coding for different SNRs.
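The factorized entropy model in (17) can be made concrete with a small sketch. Convolving a density with $\mathcal{U}(-\tfrac{1}{2}, \tfrac{1}{2})$ and evaluating at $x$ is the same as integrating the density over $[x - \tfrac{1}{2}, x + \tfrac{1}{2}]$, i.e., taking a difference of its CDF. The paper uses a learned non-parametric density; the logistic density below, its parameters `mu` and `s`, and all function names are assumptions made only for illustration. The rate is reported in bits (base-2 logarithm), which is also an assumption.

```python
import math
import random


def logistic_cdf(x, mu, s):
    """CDF of a logistic density (assumed stand-in for the learned density)."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


def pmf(x, mu=0.0, s=1.0):
    """(p * U(-1/2, 1/2))(x) = CDF(x + 1/2) - CDF(x - 1/2)."""
    return logistic_cdf(x + 0.5, mu, s) - logistic_cdf(x - 0.5, mu, s)


def rate_bits(z_tilde, mu=0.0, s=1.0):
    """Estimated code length, -sum_i log2 p(z_tilde[i]), for a factorized model."""
    return sum(-math.log2(pmf(v, mu, s)) for v in z_tilde)


def noisy_surrogate(z_d, rng):
    """Training-time replacement for rounding: add u ~ U(-1/2, 1/2)."""
    return [v + rng.uniform(-0.5, 0.5) for v in z_d]


rng = random.Random(1)
z_d = [rng.gauss(0.0, 2.0) for _ in range(16)]   # hypothetical essential part
z_round = [math.floor(v + 0.5) for v in z_d]     # inference-time quantization
z_noisy = noisy_surrogate(z_d, rng)              # differentiable surrogate

bits_round = rate_bits(z_round)
bits_noisy = rate_bits(z_noisy)
total_mass = sum(pmf(k) for k in range(-40, 41))  # telescopes to about 1
```

Because the integer probabilities telescope to one, the same function serves as a valid probability mass function for arithmetic coding at inference time and as a smooth density for the noisy values during training.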
4) Analog Transceiver: The analog channel codec aims to compress the semantic features and transmit them effectively over the air. Similarly to previous works [35], the analog channel codec mainly employs fully connected layers to transmit the semantic information, because they preserve global semantic information. Compared with the convolutional neural network (CNN) layer, which captures local information, the dense layer is good at capturing global information and preserving the entire attributes, which matches the target of the analog channel codec. This can enhance the system's robustness to channel noise.
B. Loss Function Design
The wireless multimedia transmission problem can be viewed as the classical rate-distortion optimization problem, which includes distortion and rate constraints.
1) Loss Function Design for Distortion Constraints: The distortion constraint can be categorized into semantic and channel distortion constraints. For the semantic distortion constraint, in addition to the pixel difference considered in most works, we further introduce the frequency difference of the images. The designed loss function for the semantic distortion constraint is given by
$$\mathcal{L}_{SD} = \mathbb{E}\left[\|\mathbf{I} - \hat{\mathbf{I}}\|^2 + \lambda_{\mathcal{F}} \left|\mathcal{F}(\mathbf{I}) - \mathcal{F}(\hat{\mathbf{I}})\right|\right], \tag{18}$$
where $\lambda_{\mathcal{F}}$ is the weight and $\mathcal{F}(\cdot)$ represents the Fourier transform. The first item in (18) refers to the pixel difference of the image. We assume that the pixels of the image follow the Gaussian distribution without loss of generality and thus employ the mean-square error (MSE) loss. The second item in (18) refers to the frequency difference of the image. We consider the learning of long-range dependencies of the image and design the Fourier-based loss function. In detail, we map the images into the frequency domain and compare the difference between the original and transmitted images. The reasons behind the design can be summarized as follows:
• The MSE loss guides the neural networks to recover the local pixels of the images by comparing the pixel difference, which ignores the long-range dependencies of the image.
• The Fourier-based loss can help the neural network learn the long-range dependencies of the image, because the same frequency in the frequency domain refers to different pixels at different positions of the image.
For the channel distortion constraint, we consider the distortions from channels and the transmission of essential information. The designed loss function is given by
$$\mathcal{L}_{CD} = \mathbb{E}\left[\|\mathbf{z} - \hat{\mathbf{z}}\|\right] - I(\mathbf{z}, \hat{\mathbf{z}}_D), \tag{19}$$
where the first item minimizes the distortions from channels and the second item maximizes the mutual information between $\mathbf{z}$ and $\hat{\mathbf{z}}_D$ so that $\hat{\mathbf{z}}_D$ contains more information about $\mathbf{z}$. However, directly optimizing $I(\mathbf{z}, \hat{\mathbf{z}}_D)$ is hard. We derive the lower bound of $I(\mathbf{z}, \hat{\mathbf{z}}_D)$ by
$$\begin{aligned} I(\mathbf{z}, \hat{\mathbf{z}}_D) &= \int p(\mathbf{z}, \hat{\mathbf{z}}_D) \log \frac{p(\mathbf{z} \mid \hat{\mathbf{z}}_D)}{p(\mathbf{z})} \, d\mathbf{z} \, d\hat{\mathbf{z}}_D \\ &\geq \int p(\mathbf{z}, \hat{\mathbf{z}}_D) \log q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r) \, d\mathbf{z} \, d\hat{\mathbf{z}}_D - \int p(\mathbf{z}) \log p(\mathbf{z}) \, d\mathbf{z} \\ &= \int p(\mathbf{z}) p(\hat{\mathbf{z}}_D \mid \mathbf{z}) \log q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r) \, d\mathbf{z} \, d\hat{\mathbf{z}}_D + H(\mathbf{z}) \\ &= \mathbb{E}_{\mathbf{z} \sim p(\mathbf{z})} \mathbb{E}_{\hat{\mathbf{z}}_D \sim p(\hat{\mathbf{z}}_D \mid \mathbf{z})} \left[\log q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r)\right] + H(\mathbf{z}), \end{aligned} \tag{20}$$
where the inequality follows from $\mathrm{KL}[p(\mathbf{z} \mid \hat{\mathbf{z}}_D), q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r)] \geq 0$, in which $\mathrm{KL}[\cdot, \cdot]$ is the Kullback-Leibler (KL) divergence and $q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r)$ is the variational approximation of $p(\mathbf{z} \mid \hat{\mathbf{z}}_D)$.
For the sake of argument, assume for a moment that the likelihood is given by
$$q(\mathbf{z} \mid \hat{\mathbf{z}}_D, \theta_r) = \mathcal{N}\left(\bar{\mathbf{z}}, (2\lambda_z)^{-1}\mathbf{I}\right), \tag{21}$$
where $\bar{\mathbf{z}} = \mathcal{H}^{-1}(\hat{\mathbf{z}}_D; \theta_r)$. The log-likelihood then works out to be the squared difference between $\mathbf{z}$ and $\bar{\mathbf{z}}$ weighted by $\lambda_z$. Then, $I(\mathbf{z}, \hat{\mathbf{z}}_D)$ can be rewritten as
$$I(\mathbf{z}, \hat{\mathbf{z}}_D) \geq -\lambda_z \mathbb{E}\left[\|\mathbf{z} - \bar{\mathbf{z}}\|^2\right] + H(\mathbf{z}) + \text{constant}. \tag{22}$$
Substituting (22) into (19) and omitting the constant, $\mathcal{L}_{CD}$ can be written as
$$\mathcal{L}_{CD} \approx \mathbb{E}\left[\|\mathbf{z} - \hat{\mathbf{z}}\|\right] + \lambda_z \mathbb{E}\left[\|\mathbf{z} - \bar{\mathbf{z}}\|^2\right] - H(\mathbf{z}). \tag{23}$$
If we freeze the semantic codec during training, $H(\mathbf{z})$ can be dropped from $\mathcal{L}_{CD}$.
2) Loss Function Design for Rate Constraints: For the rate constraint, the analog transmitter produces a fixed-length output. Therefore, we consider the rate constraint for the digital transmitter only, which is given by
$$\mathcal{L}_{Rate} = \mathbb{E}\left[-\log p(\tilde{\mathbf{z}}_D \mid \psi)\right], \tag{24}$$
where $p(\tilde{\mathbf{z}}_D \mid \psi)$ is given in (17). By minimizing the rate constraint, we can optimize the distribution of $\tilde{\mathbf{z}}_D$ and reduce the number of bits generated by the arithmetic coding.
C. Training Details
The proposed training algorithm is shown in Algorithm 1. We adopt a three-stage training method. The first stage is to train the semantic codec with $\mathcal{L}_{SD}$, which enables effective semantic extraction. After the semantic codec finishes training, the second stage is to train the hybrid transceiver with $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$, which aims to reduce the distortions from physical channels as well as the number of bits. We can drop $H(\mathbf{z})$ in $\mathcal{L}_{CD}$ since we freeze the semantic codec during training. The non-differentiable operations, e.g., the quantization, entropy coding, and modulation, block the gradient back-propagation from receiver to transmitter. Therefore, we substitute additive uniform noise for the non-differentiable operations themselves during training, i.e., $\tilde{\mathbf{z}}_D = \mathbf{z}_D + \mathbf{u}$ in line 10 of Algorithm 1. Besides, we choose error-free transmission for $\tilde{\mathbf{z}}_D$ due to two factors: one is that the number of generated bits is much smaller than with conventional source coding, e.g., JPEG; the other is the accurate bit transmission characteristic of digital communication. Finally, we train the whole network with $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$ to improve the quality of the recovered
image and reduce the number of bit streams in an end-to-end manner, which drives the model toward the global optimum.

When the whole network has been trained, we can employ the model to transmit images wirelessly. The inference algorithm is presented in Algorithm 2, in which the additive uniform noise is removed and replaced with the non-differentiable operations.

The three-stage training algorithm ensures that each stage can converge to a local optimum and avoids the mismatch of gradient descent. Besides, approximating the quantization with additive noise helps avoid the disappearing gradient, which enables end-to-end training. Moreover, the inference algorithm indicates that the digital component can adopt an encryption algorithm to protect the digital bits and adaptive modulation and coding to combat channel distortions.

Algorithm 1 HDA-DeepSC Training Algorithm
Function: Train Semantic Codec()
1   Input: Sample $I$ from dataset.
2   $z = \mathcal{S}(I; \alpha_t)$,
3   $\hat{I} = \mathcal{S}^{-1}(z; \alpha_r)$,
4   Compute $\mathcal{L}_{SD}$ with (18),
5   Train $\alpha_t$, $\alpha_r$ with gradient descent with $\mathcal{L}_{SD}$.
    Return: $\mathcal{S}(\cdot; \alpha_t)$ and $\mathcal{S}^{-1}(\cdot; \alpha_r)$.
Function: Train Hybrid Transceiver()
6   Input: Freeze semantic codec and sample semantic features $z$.
7   Transmitter:
8   // Analog-Digital Allocation
9   $z_D = \mathcal{H}(z; \theta_t)$,
10  $\tilde{z}_D = z_D + u$, $u \sim \mathcal{U}(-\frac{1}{2}, \frac{1}{2})$,
11  $\tilde{z} = \mathcal{H}^{-1}(\tilde{z}_D; \theta_r)$,
12  $z_A = z - \tilde{z}$.
13  // Digital Transmitter
14  Error-free transmit $\tilde{z}_D$ to avoid the disappearing gradient.
15  // Analog Transmitter
16  $x_A = \mathcal{C}_A(z_A; \beta_t)$,
17  Power normalization,
18  Transmit $x_A$ over the air.
19  Receiver:
20  Receive $\tilde{z}_D$ and $y_A$ with (6).
21  // Digital Receiver
22  $\hat{z}_D = \tilde{z}_D$,
23  $\hat{\tilde{z}} = \mathcal{H}^{-1}(\hat{z}_D; \theta_r)$.
24  // Analog Receiver
25  Signal detection by (7) to get $\hat{x}_A$,
26  $\hat{z}_A = \mathcal{C}_A^{-1}(\hat{x}_A; \beta_r)$.
27  // Analog-Digital Fusion
28  $\hat{z} = \hat{\tilde{z}} + \hat{z}_A$.
29  Compute $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$ with (23) and (24).
30  Train $\beta_t$, $\beta_r$, $\theta_t$, $\theta_r$ with gradient descent with $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$.
    Return: $\mathcal{C}_A(\cdot; \beta_t)$, $\mathcal{C}_A^{-1}(\cdot; \beta_r)$, $\mathcal{H}(\cdot; \theta_t)$, and $\mathcal{H}^{-1}(\cdot; \theta_r)$.
Function: Train Whole Network()
31  Input: Sample $I$ from dataset.
32  Repeat lines 2, 8-28, and 3 to get $\hat{I}$.
33  Compute $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$ with (18) and (24).
34  Train $\alpha_t$, $\alpha_r$, $\beta_t$, $\beta_r$, $\theta_t$, $\theta_r$ with gradient descent with $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$.
    Return: The HDA-DeepSC.

Algorithm 2 HDA-DeepSC Inference Algorithm
Function: HDA-DeepSC Inference()
1   Input: Sample $I$ from dataset.
2   Transmitter:
3   $z = \mathcal{S}(I; \alpha_t)$.
4   // Analog-Digital Allocation
5   $z_D = \mathcal{H}(z; \theta_t)$,
6   $\tilde{z}_D = \mathcal{Q}(z_D)$,
7   $\tilde{z} = \mathcal{H}^{-1}(\mathcal{Q}^{-1}(\tilde{z}_D); \theta_r)$,
8   $z_A = z - \tilde{z}$.
9   // Digital Transmitter
10  $b = \mathcal{E}(\tilde{z}_D)$,
11  $x_D = \mathcal{M}(\mathcal{C}_D(b))$.
12  // Analog Transmitter
13  $x_A = \mathcal{C}_A(z_A; \beta_t)$,
14  Power normalization,
15  $x = [x_D, x_A]$, transmit $x$ over the air.
16  Receiver:
17  Receive $y$ with (6).
18  Signal detection by (7) to get $\hat{x}_A$ and $\hat{x}_D$.
19  // Digital Receiver
20  $\hat{b} = \mathcal{C}_D^{-1}(\mathcal{M}^{-1}(\hat{x}_D))$,
21  $\hat{z}_D = \mathcal{Q}^{-1}(\mathcal{E}^{-1}(\hat{b}))$,
22  $\hat{\tilde{z}} = \mathcal{H}^{-1}(\hat{z}_D; \theta_r)$.
23  // Analog Receiver
24  $\hat{z}_A = \mathcal{C}_A^{-1}(\hat{x}_A; \beta_r)$.
25  // Analog-Digital Fusion
26  $\hat{z} = \hat{\tilde{z}} + \hat{z}_A$,
27  $\hat{I} = \mathcal{S}^{-1}(\hat{z}; \alpha_r)$.
    Return: $\hat{I}$.
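The difference between the training-time noise surrogate and the inference-time quantizer can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names `quantize_train` and `quantize_infer` are hypothetical, and the quantizer is assumed to be rounding to the nearest integer, which the text does not specify.

```python
import numpy as np

def quantize_train(z_d, rng):
    """Training-time surrogate: add u ~ U(-1/2, 1/2) instead of quantizing."""
    u = rng.uniform(-0.5, 0.5, size=z_d.shape)
    return z_d + u

def quantize_infer(z_d):
    """Inference-time quantizer (assumed here: round to the nearest integer)."""
    return np.round(z_d)

rng = np.random.default_rng(0)
z_d = rng.normal(scale=3.0, size=10000)      # stand-in for digital features z_D

err_train = quantize_train(z_d, rng) - z_d   # error injected by the surrogate
err_infer = quantize_infer(z_d) - z_d        # error injected by hard rounding
```

Both error signals are bounded by 1/2 and have a variance close to 1/12, which is why the differentiable surrogate is a reasonable stand-in for rounding while gradients are being computed.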
IV. DIFFUSION FRAMEWORK ENHANCED SIGNAL DETECTION
This section provides an overview of the de-noising diffusion framework and its background. Subsequently, we introduce a novel diffusion-based signal detection method called DiffSDNet. DiffSDNet is developed by incorporating a carefully designed variance schedule into the training and sampling algorithms. The diffusion-based de-noising module is an optional part of the HDA-DeepSC, which can further improve the robustness of the HDA-DeepSC.
A. De-Noising Diffusion Framework
Given random noise as input, the de-noising diffusion framework [36] models the generative process through multiple de-noising steps. Each step iteratively enhances the generated result by removing the predicted noise, akin to Langevin dynamics. The de-noising diffusion framework is divided into a forward process and a reverse process.
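Before the two processes are defined in detail, the closed form of the forward process can be sketched numerically. The sketch below assumes the form $v^{(t)} = \sqrt{1-(\bar{\gamma}^{(t)})^2}\,v^{(0)} + \bar{\gamma}^{(t)}\bar{\epsilon}^{(t)}$ with $\bar{\gamma}^{(t)} = \sqrt{1-\prod_{i \le t}(1-\gamma^{(i)})}$, and the linear schedule $\gamma^{(t)} = 0.5t/T$ with $T = 50$ used later in this section. The function names are hypothetical and real-valued signals are used for simplicity.

```python
import numpy as np

def variance_schedule(T=50):
    """gamma^(t) = 0.5 * t / T for t = 1..T (linear schedule)."""
    t = np.arange(1, T + 1)
    return 0.5 * t / T

def noise_coefficients(gamma):
    """gamma_bar^(t) = sqrt(1 - prod_{i<=t} (1 - gamma^(i)))."""
    return np.sqrt(1.0 - np.cumprod(1.0 - gamma))

def forward_sample(v0, t, gamma_bar, rng):
    """Draw v^(t) directly from v^(0) using the closed form (t is 1-based)."""
    eps = rng.standard_normal(v0.shape)
    g = gamma_bar[t - 1]
    return np.sqrt(1.0 - g**2) * v0 + g * eps

gamma = variance_schedule(50)
gamma_bar = noise_coefficients(gamma)
rng = np.random.default_rng(0)
v10 = forward_sample(np.ones(4), 10, gamma_bar, rng)  # noisy copy at step 10
```

With this schedule the noise coefficient grows monotonically and is numerically indistinguishable from 1 at step 50, so the last forward step is essentially pure Gaussian noise.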
1) Forward Process: The forward process is fixed to a Markov chain with $T$ steps that gradually adds Gaussian noise to the data according to a variance schedule $\gamma^{(1)}, \cdots, \gamma^{(T)}$, which is given by
$$v^{(0)} \to v^{(1)} \to v^{(2)} \to \cdots \to v^{(T-1)} \to v^{(T)}, \quad (25)$$
where $v^{(0)}$ is the input information, $p(v^{(t)} \mid v^{(t-1)}) = \mathcal{N}\left(\sqrt{1-\gamma^{(t)}}\, v^{(t-1)}, \gamma^{(t)} I\right)$, and $p(v^{(T)})$ is modeled with $\mathcal{N}(0, I)$. Due to the reparameterization of the normal distribution, $v^{(t)}$ can be represented as
$$v^{(t)} = \sqrt{1-\gamma^{(t)}}\, v^{(t-1)} + \sqrt{\gamma^{(t)}}\, \epsilon^{(t)} = \sqrt{1-\left(\bar{\gamma}^{(t)}\right)^2}\, v^{(0)} + \bar{\gamma}^{(t)} \bar{\epsilon}^{(t)}, \quad (26)$$
where $\bar{\epsilon}^{(t)} \sim \mathcal{N}(0, I)$ and $\bar{\gamma}^{(t)} = \sqrt{1 - \prod_{i=1}^{t}\left(1-\gamma^{(i)}\right)}$. As (26) shows, the forward process recurrently adds Gaussian noise step by step so that $v^{(0)}$ approaches the normal distribution, which can be viewed as an encoding process without learnable parameters.

2) Reverse Process: The reverse process is also defined as a Markov chain with $T$ steps starting at $v^{(T)}$, which is given by
$$v^{(T)} \to v^{(T-1)} \to v^{(T-2)} \to \cdots \to v^{(1)} \to v^{(0)}, \quad (27)$$
where $q(v^{(t-1)} \mid v^{(t)}) = \mathcal{N}\left(\mu(v^{(t)}; \omega), \sigma^{(t)} I\right)$. The reverse process generates $v^{(t-1)}$ based on $v^{(t)}$, in which the mean of $v^{(t-1)}$ is modeled by a neural network with $v^{(t)}$ as input.

From (26), we can observe that $v^{(t-1)}$ can be predicted from $v^{(t)}$ and $v^{(0)}$ by removing the added noise. Therefore, $\mu(v^{(t)}; \omega)$ can be modeled as
$$\mu(v^{(t)}; \omega) = \frac{1}{\sqrt{1-\gamma^{(t)}}} \left( v^{(t)} - \frac{\gamma^{(t)}}{\bar{\gamma}^{(t)}}\, \epsilon(v^{(t)}; \omega) \right), \quad (28)$$
where $\epsilon(v^{(t)}; \omega)$ predicts the noise added to $v^{(t)}$. From (28), the reverse process predicts the Gaussian noise at each step and then removes the predicted noise to restore $v^{(0)}$ from $v^{(T)}$ with learnable parameters, which can be viewed as a decoding process.

The loss function for the diffusion-based model at step $t$ is defined as
$$\mathcal{L}_{Diff}^{(t)} = \mathbb{E}\left[ \left\| \bar{\epsilon}^{(t)} - \epsilon\left( \sqrt{1-\left(\bar{\gamma}^{(t)}\right)^2}\, v^{(0)} + \bar{\gamma}^{(t)} \bar{\epsilon}^{(t)}; \omega, t \right) \right\|^2 \right]. \quad (29)$$
During training, we first sample $t$ and then construct $v^{(t)}$ from $v^{(0)}$ by adding Gaussian noise with the scheduled variance.

Compared with previous de-noising frameworks, e.g., DnCNN, that predict the noise in only one step, the de-noising diffusion framework predicts the noise over multiple steps, so that it better matches the noise distribution and achieves better de-noising performance. Therefore, we propose a de-noising diffusion-based signal detection method.

B. The Proposed De-Noising Diffusion-Based Signal Detection
The detected signals in (7) can be rewritten as
$$\hat{x} = x + \tilde{n}, \quad (30)$$
where $\tilde{n} = \frac{h^{*}}{|h|^2} n$ is the effective noise after signal detection. We employ the block-fading channel model in (6), where $h$ remains constant. Therefore, $\tilde{n}$ follows a circularly symmetric complex Gaussian distribution with zero mean and scaled variance $\sigma_{\tilde{n}}^2 = \sigma_n^2 / |h|^2$.

Since the coefficients of $p(v^{(t)} \mid v^{(t-1)})$ in (25) should satisfy $\left(\sqrt{1-\gamma^{(t)}}\right)^2 + \gamma^{(t)} = 1$, we rewrite $\hat{x}$ as
$$\tilde{x} = \frac{1}{\sqrt{1+\sigma_{\tilde{n}}^2}}\, x + \frac{\sigma_{\tilde{n}}}{\sqrt{1+\sigma_{\tilde{n}}^2}}\, \epsilon, \quad (31)$$
where $\tilde{x} = \hat{x} / \sqrt{1+\sigma_{\tilde{n}}^2}$ and $\tilde{n} = \sigma_{\tilde{n}} \epsilon$, $\epsilon \sim \mathcal{CN}(0, I)$.

Comparing (31) with (26), we find that the wireless transmission is similar to the forward process. We model $x$ and $\tilde{x}$ in (31) as $v^{(0)}$ and $v^{(t)}$ in (26). It is natural to employ the reverse process to refine $\tilde{x}$ and thereby obtain a more accurate $x$. Given $\tilde{x}$ and $\sigma_{\tilde{n}}$, we adopt (27) to remove the noise in $\tilde{x}$ and move it closer to $x$. However, the existing variance schedule of $p(v^{(t)} \mid v^{(t-1)})$ and sampling algorithm are unsuitable for wireless communications. We need to design the variance schedule and sampling algorithm by considering the channel SNR.

1) Variance Schedule Design: A variance schedule refers to the way in which the mean and variance of the added noise change over the course of the diffusion process. The mean and variance of the added noise are adjusted at each step, which affects the amount of noise introduced at each stage; the variance schedule therefore determines how the noise level evolves during the diffusion process. A variance schedule can impact the quality of the generated $x$ and the model's convergence behavior.

The variance schedule should drive the signal coefficient in (26) to zero at the last step, i.e., $\sqrt{1-\left(\bar{\gamma}^{(T)}\right)^2} \to 0$. Based on this constraint, we design the variance schedule with $T = 50$ steps, which is given by
$$\gamma^{(t)} = \frac{0.5\, t}{T}, \quad (32)$$
for which $\sqrt{1-\left(\bar{\gamma}^{(50)}\right)^2} \approx e^{-6.375} \approx 0$, i.e., $\bar{\gamma}^{(50)} \approx 1$. The designed variance schedule includes 50 different noise levels. The reasons behind the designed variance schedule can be summarized as follows.
• Compared with the conventional diffusion-based framework with 1,000 steps for generative tasks, we empirically find that the de-noising task does not need as many steps because of its lower complexity.
• We design a monotonic function $\gamma^{(t)}$ to achieve coarse-to-fine de-noising, which yields unequally spaced SNR levels, e.g., a small interval in high SNR regions and a large interval in low SNR regions. The unequal spacing speeds up de-noising by using fewer steps in low SNR regions.

2) Sampling Algorithm: The sampling algorithm performs the reverse process by sampling the steps. For example, the conventional diffusion-based framework usually samples 1,000 steps from $T \to 0$ [36] or 100 steps with the subsequence
of $T \to 0$ [37], in which $v^{(T)}$ is the first sampled step. However, starting from $v^{(T)}$ is unsuitable for signal detection. The detected signals start from different $v^{(t)}$, where $t$ depends on the received SNR at the receiver. Therefore, we propose the dynamic sampling algorithm shown in Algorithm 3.

Algorithm 3 Dynamic Sampling Algorithm
Function: Dynamic Sampling()
1   Input: The detected signal $\tilde{x}$ and $\sigma_{\tilde{n}}$.
2   Initialize $\tilde{x}$ as the starting point, $v^{(\tilde{t})}$.
3   Find the $\tilde{t}$ by (33) with $\sigma_{\tilde{n}}$.
4   for $t = \tilde{t}, \cdots, 1$ do
5       $v^{(t-1)} = \frac{1}{\sqrt{1-\gamma^{(t)}}} \left( v^{(t)} - \frac{\gamma^{(t)}}{\bar{\gamma}^{(t)}}\, \epsilon(v^{(t)}; \omega) \right)$
    Return: $v^{(0)}$.

Firstly, given the known $\sigma_{\tilde{n}}$, we search for the starting point $\tilde{t}$ of the reverse process such that the signal coefficient of $\tilde{x}$ in (31) lies between the signal coefficients of steps $\tilde{t}+1$ and $\tilde{t}$ in (26), which is given by
$$\frac{1}{\sqrt{1+\sigma_{\tilde{n}}^2}} \in \left[ \sqrt{1-\left(\bar{\gamma}^{(\tilde{t}+1)}\right)^2},\ \sqrt{1-\left(\bar{\gamma}^{(\tilde{t})}\right)^2} \right]. \quad (33)$$

Then, signal detection aims to recover the transmitted signals as accurately as possible. Therefore, we change the random sampling to deterministic sampling. In detail, we reduce the degree of randomness in the reverse process by setting $\sigma^{(t)}$ in (27) to zero, which means that $q(v^{(t-1)} \mid v^{(t)})$ changes from $\mathcal{N}\left(\mu(v^{(t)}; \omega), \sigma^{(t)} I\right)$ to the deterministic $\mu(v^{(t)}; \omega)$.

V. NUMERICAL RESULTS
In this section, we compare the proposed HDA-DeepSC with DL-based semantic communication systems and digital communication systems over AWGN and Rician fading channels, where we assume perfect CSI for all schemes.

A. Implementation Details
1) The Dataset: We choose the DIV2K dataset [38] for training, which contains 1,000 images with different scenes. The Kodak dataset is used for testing.

2) Training Settings: The semantic codec consists of 6 Swin-Transformer layers, each with 6 heads and a width of 120. The diffusion-based model adopts the structure of OpenAI-UNet. The weights $\lambda_F$, $\lambda_z$, and $\lambda_r$ are 0.1, 0.1, and 0.0005, respectively. The learning rate is $2 \times 10^{-4}$. The simulation hardware consists of an Intel Xeon Platinum 8352V CPU and an NVIDIA GeForce RTX 4090 GPU. The encryption algorithm is AES.

3) Benchmarks and Performance Metrics: We adopt separate source-channel coding, the DL-based analog semantic communication system, the DL-based digital semantic communication system, and the one-step denoising network as the benchmarks, which are detailed as follows.
• Separate source-channel coding: source and channel coding are applied separately to transmit the images, using the following technologies, respectively:
  – Better Portable Graphics (BPG) for image source coding, the state-of-the-art image compression method.
  – Low-density parity check (LDPC) coding and capacity-achieving coding for the channel coding.
  – Adaptive modulation and coding (AMC) for different SNRs, including 1/2 coding rate with BPSK, 1/2 coding rate with QPSK, 3/4 coding rate with QPSK, 1/2 coding rate with 16QAM, and 3/4 coding rate with 16QAM.
• Analog semantic communication systems: the purely analog semantic communication of HDA-DeepSC trained with MSE loss.
• Digital semantic communication systems: the DeepJSCC-Q proposed in [13].
• Conventional HDA transmission systems with 2D discrete cosine transform and scalar quantization [30].
• Denoising convolutional neural network (DnCNN) as the one-step de-noising benchmark.

The LDPC codes we use are from the 802.11ad standard, with a blocklength of 672 bits for both the 1/2 and 3/4 rate codes. The coherence time is set as the transmission time for each image in the simulation. We set $r = 1$ for the Rician channels and $h = 1$ for the AWGN channels. Peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM) are used as the metrics to measure the local and global quality of images. MS-SSIM is reported in dB as
$$\text{MS-SSIM (dB)} = -10 \log_{10}\left(1 - \text{MS-SSIM}\right). \quad (34)$$

B. Denoising Networks Comparisons
Fig. 3. The PSNR performance comparison between Analog DeepSC and Analog DeepSC with different denoisers on the Kodak dataset.

Fig. 3 presents the PSNR performance of the analog DeepSC with different denoisers. First, the analog DeepSC with a denoiser achieves a higher PSNR than that without a denoiser in the low SNR regimes, which validates the effectiveness of the denoiser in reducing the noise level. At the small noise levels of the high SNR regimes, the analog DeepSC alone is capable of restoring the signals; therefore, all methods achieve a similar PSNR as the SNR increases. Furthermore, the analog DeepSC with DiffSDNet outperforms that with DnCNN by 0.6 dB in terms of PSNR. This suggests that the multiple-step denoiser has a stronger denoising capability than the one-step denoiser.
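The multiple-step denoiser compared above relies on the dynamic sampling of Algorithm 3. A minimal NumPy sketch is given below under several assumptions: signals are real-valued for simplicity, the trained network $\epsilon(\cdot; \omega)$ is replaced by a hypothetical oracle `predict_noise` that knows the transmitted signal, and the starting step is chosen as the first step whose noise coefficient reaches that of the normalized signal, which is one reading of (33). None of the function names come from the paper.

```python
import numpy as np

def find_start_step(sigma, gamma_bar):
    """First 1-based step whose noise coefficient is >= sigma / sqrt(1 + sigma^2)."""
    target = sigma / np.sqrt(1.0 + sigma**2)
    idx = int(np.searchsorted(gamma_bar, target))   # 0-based position
    return min(idx, len(gamma_bar) - 1) + 1

def dynamic_sampling(x_hat, sigma, gamma, gamma_bar, predict_noise):
    """Deterministic reverse process started from the SNR-dependent step."""
    v = x_hat / np.sqrt(1.0 + sigma**2)             # normalize as in (31)
    t_start = find_start_step(sigma, gamma_bar)
    for t in range(t_start, 0, -1):
        eps = predict_noise(v, t)
        v = (v - gamma[t - 1] / gamma_bar[t - 1] * eps) / np.sqrt(1.0 - gamma[t - 1])
    return v, t_start

T = 50
gamma = 0.5 * np.arange(1, T + 1) / T               # schedule of (32)
gamma_bar = np.sqrt(1.0 - np.cumprod(1.0 - gamma))  # noise coefficients

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=64)                # toy transmitted symbols
sigma = 1.0                                         # noise std, i.e., SNR = 0 dB
x_hat = x + sigma * rng.standard_normal(x.shape)    # detected signal as in (30)

def predict_noise(v, t):
    """Oracle stand-in for the trained network: residual noise given the true x."""
    a_t = np.sqrt(1.0 - gamma_bar[t - 1] ** 2)
    return (v - a_t * x) / gamma_bar[t - 1]

x_rec, t_start = dynamic_sampling(x_hat, sigma, gamma, gamma_bar, predict_noise)
```

With a perfect noise predictor the loop returns the transmitted symbols exactly, and only `t_start` of the 50 steps are executed. In practice the quality depends on the trained noise predictor, which is what Fig. 3 and Table I evaluate.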
Fig. 4. Comparison between HDA-DeepSC and the Analog DeepSC, DeepJSCC-Q, and BPG with different channel coding on the Kodak dataset over AWGN channels.
Fig. 5. Comparison between HDA-DeepSC and the Analog DeepSC, DeepJSCC-Q, and BPG with different channel coding on the Kodak dataset over Rician channels.
TABLE I
THE PSNR COMPARISON BETWEEN THE ANALOG DEEPSC WITH DIFFERENT DIFFUSION-BASED DENOISERS AT SNR = 0 DB

Table I shows the comparison between the analog DeepSC with DDPM and with DiffSDNet. The proposed DiffSDNet achieves a higher PSNR with fewer sampling steps than DDPM, confirming the effectiveness of the designed variance schedule and sampling algorithm. In particular, the PSNR of the analog DeepSC with DDPM decreases as the number of sampling steps increases. This is due to the high degree of randomness introduced in the reverse process.

C. Communication System Comparisons
Figs. 4 and 5 report the PSNR and MS-SSIM comparison between the various methods over AWGN channels and Rician channels with a 1/6 bandwidth compression ratio. For AWGN channels, we can see in Fig. 4 that our HDA-DeepSC outperforms all the benchmarks. This indicates that the discrete signals of the digital component can accurately deliver crucial semantic information for detail recovery, and the continuous signals of the analog component can prevent the leveling-off and cliff-edge effects with lower quantization errors. Besides, the HDA-DeepSC achieves the best performance in terms of MS-SSIM, which means that the images transmitted by HDA-DeepSC have better global quality. This is likely because we introduce the Fourier-based loss function that makes the model learn the long-distance dependencies. For the Rician channel case shown in Fig. 5, we observe that the DL-based analog systems are more robust to channel changes due to the high degree of freedom in continuous signals, from which the HDA-DeepSC benefits through its analog component. Moreover, the low bandwidth consumption of the digital part allows us to use low-rate channel coding to achieve accurate delivery while transmitting a small number of symbols, thereby ensuring robustness in the low SNR regimes. This is the reason why we assume error-free transmission while training the digital part. Besides, if the communication environment is terrible,
in which the digital signals cannot be successfully decoded, this system will experience the cliff-edge effect due to the employed entropy coding. This can be improved in several ways. One is to replace the entropy coding module with a learning-based quantization module. Another is to introduce transmission errors during training. Both methods can lead the model to learn to correct the errors in digital transmission. Visual examples are presented in Appendix B.

TABLE II
THE ABLATIONS OF FOURIER-BASED COMPONENT: MSE LOSS, MSE LOSS WITH FOURIER-BASED MODULE, AND MSE LOSS WITH FOURIER-BASED LOSS

In Table II, we study the ablations of Fourier-based components by only considering the semantic codec, in which the MSE loss with Fourier-based module means that we insert the pluggable Fourier-based modules [39] into the semantic codec and train the semantic codec with the MSE loss function. The Fourier-based module or loss can improve the quality of images by more than 2 dB in terms of PSNR and MS-SSIM due to the long-distance dependency learning in the frequency domain. Besides, we observe that the Fourier-based loss increases MS-SSIM considerably more than the Fourier-based module. The reason is that the Fourier-based module introduces additional Fourier-based parameters, making it challenging to further improve its performance. This suggests that the Fourier-based loss can directly capture the global information of images without additional parameters and is hence an attractive loss for improving the global quality of images.

D. Bandwidth Compression Ratio Comparisons
Fig. 6. PSNR and MS-SSIM performance for different bandwidth compression ratios on the Kodak dataset over AWGN channels.
Fig. 7. PSNR and MS-SSIM performance for different bandwidth compression ratios on the Kodak dataset over Rician channels.

Figs. 6 and 7 demonstrate the comparisons for different bandwidth compression ratios over AWGN and Rician channels at SNR = 10 dB. The HDA-DeepSC outperforms all the benchmarks in terms of PSNR and MS-SSIM. For example, the HDA-DeepSC achieves the same PSNR as separate coding (BPG with 1/2 LDPC and 16QAM) with a 33% improvement in bandwidth compression ratio. This suggests that the HDA-DeepSC can provide a higher data transmission
rate than the benchmarks for a given PSNR or MS-SSIM. Besides, we find that the learning-based methods outperform BPG in terms of MS-SSIM, indicating that the neural networks operate as better content generators, thereby generating images with global consistency.

E. Digital-Analog Ratio Comparisons
Fig. 8. PSNR and MS-SSIM performance for different digital-analog ratios on the Kodak dataset over AWGN channels.

Fig. 8 shows the comparisons across different digital-analog (DA) ratios by changing the ratio between the number of transmitted symbols of the digital and analog components, where the total number of transmitted symbols is fixed. A larger DA ratio means that more semantic information is transmitted with the digital transmitter, and vice versa. We can observe that the PSNR and MS-SSIM decrease as the DA ratio increases, which is caused by the unavoidable quantization errors introduced by the digital transmitter. The more semantic information is transmitted through the digital transmitter, the larger the quantization errors introduced to the transmitted information. This suggests that the analog transmitter operates as a continuous signal-based system, so that decreasing the DA ratio effectively reduces the quantization errors.

F. Data Security
Fig. 9. Visualized examples for different methods transmitted over AWGN channels at SNR = 10 dB: (a) original image; (b) image recovered by BPG with 1/2 LDPC and 16QAM; (c) image recovered by HDA-DeepSC with 0.2 DA ratio using unencrypted bits; (d)-(f) images recovered by HDA-DeepSC with 0.2, 0.87, and 3 DA ratio using encrypted bits, respectively.
TABLE III
THE PSNR PERFORMANCE FOR THE ENCRYPTED AND UNENCRYPTED BITS OVER AWGN CHANNELS AT SNR = 10 DB

Table III reports the PSNR performance for the encrypted and unencrypted bits, where these terms refer to whether the encryption algorithm encrypts the bit streams transmitted by the digital transmitter. We assume that the eavesdropper is
incapable of decoding the encrypted bits and only decodes the semantic information transmitted by the analog transmitter, where the HDA-DeepSC model is known to the eavesdropper. From Table III, the PSNR of encrypted bits is 20 dB lower than that of unencrypted bits, indicating that the images recovered with encrypted bits bear little resemblance to the original ones. In other words, the eavesdropper obtains little information from the semantic information transmitted by the analog transmitter. Besides, the PSNR of encrypted bits slightly decreases as the DA ratio increases. This suggests that the HDA-DeepSC effectively safeguards data with few bits while achieving a high PSNR. Visual examples are presented in Figs. 9(c)-(f), where Figs. 9(d)-(f) are the images recovered by encrypted bits. Interestingly, the essential information, e.g., the color, the background, and the textures, is protected by the HDA-DeepSC, which demonstrates the effectiveness of the HDA-DeepSC in data security.

G. Computational Complexity
The proposed HDA-DeepSC adopts the Swin-Transformer as the semantic codec, in which the window multi-head self-attention (W-MSA) module has high computational complexity. The computational complexity of W-MSA is $O(N \times h \times w \times (4C^2 + 2M^2C))$, in which $N$, $C$, and $M$ are the number of layers, the width of the layer, and the number of patches, respectively, and $h \times w$ is the spatial size of the feature map. The channel codec consists of several dense layers, the computational complexity of which is also linear in the number of pixels. Therefore, the encoding/decoding time of the proposed HDA-DeepSC is linear in the number of pixels. To complete our discussion of computational complexity, we have measured the average running time per image, which is shown in Table IV. We can observe that HDA-DeepSC on the CPU is slightly slower than BPG on the CPU. However, the GPU can significantly accelerate HDA-DeepSC, which means it can effectively support some delay-sensitive applications.

TABLE IV
THE RUNNING TIME PER IMAGE COMPARISON BETWEEN THE HDA-DEEPSC AND BPG

H. Discussion of Hardware Implementation
It is possible nowadays to implement analog systems with high-precision digital circuits, called pseudo-analog transmission. For example, the pseudo-analog system SoftCast [40] does not adopt the conventional constellations but modulates the normalized 2D discrete Fourier coefficients to the transmitted symbols directly. There are a lot of follow-up efforts, and some of them [40], [41] have been validated on software radio platforms with orthogonal frequency division multiplexing (OFDM). Therefore, it is feasible to achieve hybrid analog and digital transmission on one hardware platform. For low-precision digital circuits, we can employ the quantization method to reduce the bit budget of learned constellations, such as achieving low-precision pseudo-analog transmission. The cost is a slight performance degradation.

VI. CONCLUSION
In this paper, we have introduced an innovative HDA semantic communication framework that combines the strengths of analog and digital semantic communications. Our framework aims to overcome the inherent limitations associated with each approach. Building upon the framework, we introduced a robust HDA semantic communication system called HDA-DeepSC, specifically designed for multimedia transmission. HDA-DeepSC leverages digital communication methods to transmit crucial semantic information, ensuring accurate delivery and data security. Additionally, it utilizes analog communication methods to transmit auxiliary semantic information, effectively mitigating the leveling-off and cliff-edge effects associated with traditional approaches. We also introduced analog-digital allocation and fusion modules to separate and fuse the digital and analog components, respectively. Besides, we have designed the Fourier-based loss function to guide the model in learning the long-distance dependencies and combined the rate constraint with the non-parametric, fully factorized density model. Moreover, we have proposed the diffusion framework enhanced signal detection, named DiffSDNet, which uses multiple denoising steps to reduce the noise level in the low SNR regimes and for which we customized the variance schedule and sampling algorithm for wireless communication environments. The numerical results have demonstrated the effectiveness of DiffSDNet in denoising and the superiority of HDA-DeepSC in terms of robustness, transmission rate, and data security, especially in low SNR regimes. Therefore, the proposed HDA semantic communication framework shows great promise as a candidate for the new semantic communication paradigm, offering significant potential for real-world implementations.

APPENDIX A
DERIVATION OF (15)
Assume that $x_i$, $i = 1, 2, \cdots, N$, are $N$ independent Gaussian sources (variables) with zero mean and variance $\sigma_i^2$. Then the entropy of $x = [x_1, x_2, \cdots, x_N]$ can be written as
$$H(x) = -\mathbb{E}_{x \sim p(x)}\left[\log_2 p(x)\right] = -\mathbb{E}_{x \sim p(x)}\left[\log_2 \prod_{i=1}^{N} p(x_i)\right]$$
$$= \sum_{i=1}^{N} \mathbb{E}_{x_i \sim p(x_i)}\left[ \log_2\left(2\pi\sigma_i^2\right) + \frac{x_i^2}{2\sigma_i^2} \right]$$
$$= \sum_{i=1}^{N} \mathbb{E}_{x_i \sim p(x_i)}\left[ \frac{x_i^2}{2\sigma_i^2} \right] + \sum_{i=1}^{N} \log_2\left(2\pi\sigma_i^2\right). \quad (35)$$
With (35), we can derive the following relationship,
$$H(z) - H(\tilde{z}) = \sum_{i=1}^{N}\left( \mathbb{E}_{z_i \sim p(z_i)}\left[\frac{z_i^2}{2\sigma_i^2}\right] - \mathbb{E}_{\tilde{z}_i \sim p(\tilde{z}_i)}\left[\frac{\tilde{z}_i^2}{2\tilde{\sigma}_i^2}\right] \right) + \sum_{i=1}^{N} \log_2\left(\frac{\sigma_i^2}{\tilde{\sigma}_i^2}\right)$$
$$> \frac{1}{2\sigma^2} \sum_{i=1}^{N}\left( \mathbb{E}_{z_i \sim p(z_i)}\left[z_i^2\right] - \mathbb{E}_{\tilde{z}_i \sim p(\tilde{z}_i)}\left[\tilde{z}_i^2\right] \right) + \sum_{i=1}^{N} \log_2\left(\frac{\sigma_i^2}{\tilde{\sigma}_i^2}\right), \quad (36)$$
where $\sigma_i^2$ and $\tilde{\sigma}_i^2$ are the variances of $z_i$ and $\tilde{z}_i$, respectively, and $\sigma^2$ is the maximum value among $\sigma_i^2$ and $\tilde{\sigma}_i^2$.

We can observe that the second term of (36), which depends only on the ratios $\sigma_i^2 / \tilde{\sigma}_i^2$, is a constant. In particular, when $\tilde{z}$ is close to $z$, the second term is zero. Therefore, we can drop the second term during training and only consider the first term of (36). With the Monte Carlo method, the entropy difference can be written as
$$H(z) - H(\tilde{z}) \propto z^2 - \tilde{z}^2. \quad (37)$$
Considering the computation and training complexity, we relax (37) to the subtraction between $z$ and $\tilde{z}$, which gives (15).

APPENDIX B
VISUALIZED RESULTS
In Figs. 9(a)-(c), we can observe that the proposed HDA-DeepSC restores more details, e.g., the mouth and feathers of the parrot, than BPG with LDPC and 16QAM, because the digital transmitter delivers the essential semantic information accurately.

REFERENCES
[1] H. Xie, Z. Qin, Z. Han, and K. B. Letaief, “Hybrid digital-analog joint semantic-channel coding for image transmission,” in Proc. IEEE Global Commun. Conf., Cape Town, South Africa, Dec. 2024, pp. 1–6.
[2] C.-X. Wang et al., “On the road to 6G: Visions, requirements, key technologies, and testbeds,” IEEE Commun. Surveys Tuts., vol. 25, no. 2, pp. 905–974, 2nd Quart., 2023.
[3] Z. Qin, X. Tao, J. Lu, W. Tong, and G. Ye Li, “Semantic communications: Principles and challenges,” 2021, arXiv:2201.01389.
[4] H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,” IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021.
[5] P. Yi, Y. Cao, X. Kang, and Y.-C. Liang, “Deep learning-empowered semantic communication systems with a shared knowledge base,” IEEE Trans. Wireless Commun., vol. 23, no. 6, pp. 6174–6187, Jun. 2024.
[6] Z. Weng, Z. Qin, X. Tao, C. Pan, G. Liu, and G. Y. Li, “Deep learning enabled semantic communications with speech recognition and synthesis,” IEEE Trans. Wireless Commun., vol. 22, no. 9, pp. 6227–6240, Sep. 2023.
[7] E. Grassucci, C. Marinoni, A. Rodriguez, and D. Comminiello, “Diffusion models for audio semantic communication,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Seoul, South Korea, Apr. 2024, p. 13.
[8] T. Han, Q. Yang, Z. Shi, S. He, and Z. Zhang, “Semantic-preserved communication system for highly efficient speech transmission,” IEEE J. Sel. Areas Commun., vol. 41, no. 1, pp. 245–259, Jan. 2023.
[9] J. Dai et al., “Nonlinear transform source-channel coding for semantic communications,” IEEE J. Sel. Areas Commun., vol. 40, no. 8, pp. 2300–2316, Aug. 2022.
[10] G. Zhang, Q. Hu, Z. Qin, Y. Cai, G. Yu, and X. Tao, “A unified multi-task semantic communication system for multimodal data,” IEEE Trans. Commun., vol. 72, no. 7, pp. 4101–4116, Jul. 2024.
[11] H. Wu, Y. Shao, E. Ozfatura, K. Mikolajczyk, and D. Gündüz, “Transformer-aided wireless image transmission with channel feedback,” IEEE Trans. Wireless Commun., vol. 23, no. 9, pp. 11904–11919, Sep. 2024.
[14] Y. Bo, Y. Duan, S. Shao, and M. Tao, “Joint coding-modulation for digital semantic communications via variational autoencoder,” IEEE Trans. Commun., vol. 72, no. 9, pp. 5626–5640, Sep. 2024.
[15] Y. He, G. Yu, and Y. Cai, “Rate-adaptive coding mechanism for semantic communications with multi-modal data,” IEEE Trans. Commun., vol. 72, no. 3, pp. 1385–1400, Mar. 2024.
[16] L. Guo, W. Chen, Y. Sun, and B. Ai, “Device-edge digital semantic communication with trained non-linear quantization,” in Proc. IEEE 97th Veh. Technol. Conf. (VTC-Spring), Jun. 2023, pp. 1–5.
[17] C. Liu, C. Guo, Y. Yang, W. Ni, and T. Q. S. Quek, “OFDM-based digital semantic communication with importance awareness,” 2024, arXiv:2401.02178.
[18] Q. Fu et al., “Vector quantized semantic communication system,” IEEE Wireless Commun. Lett., vol. 12, no. 6, pp. 982–986, Jun. 2023.
[19] Q. Hu, G. Zhang, Z. Qin, Y. Cai, G. Yu, and G. Y. Li, “Robust semantic communications with masked VQ-VAE enabled codebook,” IEEE Trans. Wireless Commun., vol. 22, no. 12, pp. 8707–8722, Dec. 2023.
[20] H. Gao, G. Yu, and Y. Cai, “Adaptive modulation and retransmission scheme for semantic communication systems,” IEEE Trans. Cognit. Commun. Netw., vol. 10, no. 1, pp. 150–163, Feb. 2024.
[21] J. Huang, K. Yuan, C. Huang, and K. Huang, “D2-JSCC: Digital deep joint source-channel coding for semantic communications,” 2024, arXiv:2403.07338.
[22] U. Mittal and N. Phamdo, “Hybrid digital-analog (HDA) joint source-channel codes for broadcasting and robust communications,” IEEE Trans. Inf. Theory, vol. 48, no. 5, pp. 1082–1102, May 2002.
[23] T. Fujihashi, T. Koike-Akino, and T. Watanabe, “Soft delivery: Survey on a new paradigm for wireless and mobile multimedia streaming,” ACM Comput. Surv., vol. 56, no. 2, pp. 1–37, Sep. 2023.
[24] M. Skoglund, N. Phamdo, and F. Alajaji, “Hybrid digital-analog source-channel coding for bandwidth compression/expansion,” IEEE Trans. Inf. Theory, vol. 52, no. 8, pp. 3757–3763, Aug. 2006.
[25] M. Rüngeler, J. Bunte, and P. Vary, “Design and evaluation of hybrid digital-analog transmission outperforming purely digital concepts,” IEEE Trans. Commun., vol. 62, no. 11, pp. 3983–3996, Nov. 2014.
[26] E. Köken and E. Tuncel, “On robustness of hybrid digital/analog source-channel coding with bandwidth mismatch,” IEEE Trans. Inf. Theory, vol. 61, no. 9, pp. 4968–4983, Sep. 2015.
[27] T. Fujihashi, T. Koike-Akino, T. Watanabe, and P. V. Orlik, “HoloCast+: Hybrid digital-analog transmission for graceful point cloud delivery with graph Fourier transform,” IEEE Trans. Multimedia, vol. 24, pp. 2179–2191, 2022.
[28] J. A. Hart, The Economics, Technology and Content of Digital TV. Boston, MA, USA: Springer, 2004.
[29] L. Yu, H. Li, and W. Li, “Wireless scalable video coding using a hybrid digital-analog scheme,” IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 2, pp. 331–345, Feb. 2014.
[30] C. Lan, C. Luo, W. Zeng, and F. Wu, “A practical hybrid digital-analog scheme for wireless video transmission,” IEEE Trans. Circuits Syst. Video Technol., vol. 28, no. 7, pp. 1634–1647, Jul. 2018.
[31] B. Tan, J. Wu, R. Wang, W. Luo, and J. Liu, “An optimal resource allocation for hybrid digital-analog with combined multiplexing,” IEEE Internet Things J., vol. 6, no. 1, pp. 1125–1135, Feb. 2019.
[32] P. Yahampath, “Video coding for OFDM systems with imperfect CSI: A hybrid digital-analog approach,” Signal Process., Image Commun., vol. 87, Sep. 2020, Art. no. 115903.
[33] Z. Liu et al., “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 9992–10002.
[34] J. Balle, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, “Variational image compression with a scale hyperprior,” in Proc. Int. Conf. Learn. Represent., Vancouver, BC, Canada, Apr. 2018.
[35] H. Xie, Z. Qin, X. Tao, and K. B. Letaief, “Task-oriented multi-user semantic communications,” IEEE J. Sel. Areas Commun., vol. 40, no. 9, pp. 2584–2597, Sep. 2022.
[36] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” in Proc. Adv. Neural Inf. Process. Syst., Dec. 2020, pp. 6840–6851.
[37] J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” in Proc. Int. Conf. Learn. Represent., May 2021.
[12] S. Wang et al., “Wireless deep video semantic transmission,” IEEE [38] A. Ignatov et al., “PIRM challenge on perceptual image enhancement
J.Sel.AreasCommun.,vol.41,no.1,pp.214229,Jan.2023. onsmartphones:Report,”inProc.Eur.Conf.Comput.Vis.,Jan.2019,
[13] T.-Y.Tung,D.B.Kurka,M.Jankowski,andD.Gu¨ndu¨z,“DeepJSCC- pp.315333.
Q: Constellation constrained deep joint source-channel coding,” IEEE [39] L.Chi,B.Jiang,andY.Mu,“FastFourierconvolution,”inProc.Adv.
J.Sel.AreasInf.Theory,vol.3,no.4,pp.720731,Dec.2022. NeuralInf.Process.Syst.,Dec.2020,pp.44794488.
[40] S. Jakubczak and D. Katabi, “SoftCast: One-size-fits-all wireless video,” in Proc. ACM SIGCOMM Conf., New York, NY, USA, Aug. 2010, pp. 449–450.
[41] X. L. Liu, W. Hu, Q. Pu, F. Wu, and Y. Zhang, “ParCast: Soft video delivery in MIMO-OFDM WLANs,” in Proc. 18th Annu. Int. Conf. Mobile Comput. Netw., Istanbul, Turkey, Aug. 2012, pp. 233–244.

Huiqiang Xie (Member, IEEE) received the B.S. degree from Northwestern Polytechnical University, the M.S. degree from Chongqing University, and the Ph.D. degree from the Queen Mary University of London in 2023. From 2023 to 2024, he was a Post-Doctoral Research Associate with The Hong Kong University of Science and Technology, Guangzhou Campus. He is currently an Associate Professor with Jinan University. He received the 2023 IEEE ICC Student Travel Grant, the 2023 IEEE ICC Best Paper Award, and the 2023 IEEE Signal Processing Society Best Paper Award. He was also the Organizing Committee Co-Chair of 2024 EIECT. He is an Associate Editor of Journal of Communications and Networks.

Zhijin Qin (Senior Member, IEEE) is currently an Associate Professor with Tsinghua University, Beijing, China. She was with the Imperial College London, London, U.K.; Lancaster University, Lancaster, U.K.; and Queen Mary University of London, London, from 2016 to 2022. Her research interests include semantic communications and sparse signal processing. She was a recipient of the 2017 IEEE GLOBECOM Best Paper Award, 2018 IEEE Signal Processing Society Young Author Best Paper Award, 2021 IEEE Communications Society Signal Processing for Communications Committee Early Achievement Award, 2022 IEEE Communications Society Fred W. Ellersick Prize, and 2023 IEEE ICC Best Paper Award. She was a Guest Editor of IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (JSAC) Special Issue on Semantic Communications and an Area Editor of IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS Series. She was also the Symposium Co-Chair of IEEE GLOBECOM 2020 and 2021. She is an Associate Editor of IEEE TRANSACTIONS ON COMMUNICATIONS, IEEE TRANSACTIONS ON COGNITIVE NETWORKING, and IEEE COMMUNICATIONS LETTERS.

Zhu Han (Fellow, IEEE) received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 1997, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Maryland at College Park, College Park, MD, USA, in 1999 and 2003, respectively. From 2000 to 2002, he was a Research and Development Engineer with JDSU, Germantown, MD, USA. From 2003 to 2006, he was a Research Associate with the University of Maryland at College Park. From 2006 to 2008, he was an Assistant Professor with Boise State University, Boise, ID, USA. He is currently a John and Rebecca Moores Professor with the Electrical and Computer Engineering Department and the Computer Science Department, University of Houston, Houston, TX, USA. His main research targets on the novel game-theory-related concepts critical to enabling efficient and distributive use of wireless networks with limited resources, wireless resource allocation and management, wireless communications and networking, quantum computing, data science, smart grids, carbon neutralization, and security and privacy. He received a NSF Career Award in 2010, the Fred W. Ellersick Prize of the IEEE Communication Society in 2011, the Best Paper Award of the EURASIP Journal on Advances in Signal Processing in 2015, the IEEE Leonard G. Abraham Prize in the field of Communications Systems (Best Paper Award in IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS) in 2016, the IEEE Vehicular Technology Society 2022 Best Land Transportation Paper Award, and several best paper awards in IEEE conferences. He was an IEEE Communications Society Distinguished Lecturer from 2015 to 2018 and an ACM Distinguished Speaker from 2022 to 2025. He has been an AAAS Fellow since 2019 and an ACM Fellow since 2024. He has been a 1% Highly Cited Researcher since 2017 according to Web of Science. He is also the Winner of the 2021 IEEE Kiyo Tomiyasu Award (an IEEE Field Award), for outstanding early to mid-career contributions to technologies holding the promise of innovative applications, with the following citation: “For contributions to game theory and distributed management of autonomous communication networks.”

Khaled B. Letaief (Fellow, IEEE) received the B.S. degree (Hons.) in electrical engineering from Purdue University at West Lafayette, IN, USA, in December 1984, the M.S. and Ph.D. degrees in electrical engineering from Purdue University, in 1986, and 1990, respectively, and the Ph.D. Honoris Causa degree from the University of Johannesburg, South Africa, in 2022. He is an internationally recognized leader in wireless communications and networks. He is a member of United States National Academy of Engineering, a fellow of Hong Kong Institution of Engineers, a member of India National Academy of Sciences, and a member of Hong Kong Academy of Engineering Sciences. He is also recognized by Thomson Reuters as an ISI Highly Cited Researcher and was listed among the 2020 top 30 of AI 2000 Internet of Things Most Influential Scholars. He was a recipient of many distinguished awards and honors, including the 2022 IEEE Communications Society Edwin Howard Armstrong Achievement Award, 2021 IEEE Communications Society Best Survey Paper Award, 2019 IEEE Communications Society and Information Theory Society Joint Paper Award, and 2016 IEEE Marconi Prize Paper Award in Wireless Communications. He has also been a dedicated teacher committed to excellence in teaching and scholarship. He received the Michael G. Gale Medal for Distinguished Teaching (highest university-wide teaching award and only one recipient/year is honored for his/her contributions). Since 1993, he has been with The Hong Kong University of Science and Technology (HKUST), where he has held many administrative positions, including the Acting Provost, the Head of the Electronic and Computer Engineering Department, and the Director of Hong Kong Telecom Institute of Information Technology. While at HKUST, he was the Chair Professor and the Dean of Engineering. He is well recognized for his dedicated service to professional societies and IEEE, where he has served in many leadership positions. These include the Founding Editor-in-Chief of the prestigious IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS. He also served as the President of the IEEE Communications Society (2018–2019), the world’s leading organization for communications professionals with headquarters in New York City and members in 162 countries. He also served as a member for the IEEE Board of Directors.