2478 IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS,VOL.43,NO.7,JULY2025 Hybrid Digital-Analog Semantic Communications Huiqiang Xie , Member, IEEE, Zhijin Qin , Senior Member, IEEE, Zhu Han , Fellow, IEEE, and Khaled B. Letaief , Fellow, IEEE Abstract—Digital and analog semantic communications (Sem- andubiquitousconnectedintelligence.Tomeetthesedemands, Com) face inherent limitations such as data security concerns target key performance indicators [2] have been proposed, in analog SemCom, as well as leveling-off and cliff-edge effects aiming to ensure the seamless integration of these advanced in digital SemCom. In order to overcome these challenges, technologies in the next generation of mobile communication we propose a novel SemCom framework and a corresponding systems, e.g., 107 devices/km2 for connectivity, 60 b/s/Hz system called HDA-DeepSC, which leverages a hybrid digital- analog approach for multimedia transmission. This is achieved for spectral efficiency, and 100 us for end-to-end latency. through the introduction of analog-digital allocation and fusion To materialize this vision, semantic communications [3] have modules. To strike a balance between data rate and distortion, been envisioned as one of the potential technologies due to wedesignnewlossfunctionsthattakeintoaccountlong-distance the low semantic errors, high spectral efficiency, and high dependencies in the semantic distortion constraint, essential transmission rates. By exchanging semantic information at information recovery in the channel distortion constraint, and both ends, semantic communications can reconstruct sources optimal bit stream generation in the rate constraint. Addi- tionally, we propose denoising diffusion-based signal detection or directly perform tasks with the tolerance of transmission techniques, which involve carefully designed variance schedules errors. According to the communication paradigm, seman- and sampling algorithms to refine transmitted signals. 
Through tic communications can be categorized into two categories: extensivenumericalexperiments,wewilldemonstratethatHDA- analog semantic communications and digital semantic com- DeepSC exhibits robustness to channel variations and is capable munications. of supporting various communication scenarios. Our proposed Analog semantic communications [4], [5], [6], [7], [8], framework outperforms existing benchmarks in terms of peak signal-to-noise ratio and multi-scale structural similarity, show- [9], [10], [11], [12] convey the semantic information using casing its superiority in semantic communication quality. continuous signals, which takes advantage of deep learning Index Terms—Semantic communications, multimedia trans- (DL) to design end-to-end systems and maps the source to mission,analogcommunications,digitalcommunications,hybrid the non-fixed-size constellations directly. There exist many digital-analog communications. works for different modal data transmission. Xie et al. [4] have developed a DL based semantic communication system, I. INTRODUCTION named DeepSC, for text transmission, in which the sentences AS MOBILE communication systems transition from the are mapped to the embedding vectors and then transformed fifth generation (5G) to the sixth generation (6G), there to the learned non-fixed-size constellation points. Yi et al. is a need to address the evolving requirements of seamlessly [5] introduced the explicit knowledge base to the DeepSC as integrating virtual/augmented reality, remote control robots, the side information and integrated the knowledge base into the end-to-end optimization, achieving the higher bilingual Received 15 May 2024; revised 16 December 2024; accepted 15 January 2025. Date of publication 10 April 2025; date of current version 19 June evaluationunderstudy(BLEU)scoreatthelowsignal-to-noise 2025. This work was supported in part by the National Key Research and ratio (SNR) regions. Weng et al. 
[6] have proposed an end-to- Development Program of China under Grant 2023YFB2904300; in part by end semantic communication system for speech recognition the National Natural Science Foundation of China (NSFC) under Grant 62401227 and Grant 62293484; in part by Guangzhou Municipal Science and speech synthesis tasks, named DeepSC-ST. The speech andTechnologyProjectunderGrant2025A04J3380;inpartbyFundamental signals are processed by the DeepSC-ST and output the con- Research Funds for the Central Universities under Grant 21624349; in part tinuous constellation points at the transmitter. Grassucci et al. by the Hong Kong Research Grants Council under the Areas of Excellence [7]havedesignedagenerativeaudiosemanticcommunication Scheme under Grant AoE/E-601/22-R; in part by NSF ECCS-2302469, Toyota;inpartbyAmazon;andinpartbytheJapanScienceandTechnology framework,whichtransmitsthecontinuousembeddingvectors Agency (JST) Adopting Sustainable Partnerships for Innovative Research togeneratetheaudiosatthereceiver.Daietal.[9]haveinves- Ecosystem (ASPIRE) under Grant JPMJAP2326. An earlier version of this tigated the end-to-end image transmission problem, in which paper was presented in part at the IEEE Globecom Workshop 2024 [1]. (Correspondingauthor:ZhijinQin.) the image is non-linearly transformed into continuous signals Huiqiang Xie is with the College of Information Science and with different lengths. Wu et al. [11] have investigated the Technology, Jinan University, Guangzhou 510632, China (e-mail: end-to-end image transmission for multiple-inputs multiple- huiqiangxie@jnu.edu.cn). 
Zhijin Qin is with the Department of Electronic Engineering, Tsinghua outputs(MIMO)channels.Similarly,theimagesareconverted University, Beijing 100084, China, also with the State Key Laboratory of into continuous semantic features and adaptively assigned to Space Network and Communications, Beijing 100084, China, and also with different subchannels based on the channel state information Beijing National Research Center for Information Science and Technology, Beijing100084,China(e-mail:qinzhijin@tsinghua.edu.cn). (CSI). Wang et al. [12] have proposed a video semantic ZhuHaniswiththeDepartmentofElectricalandComputerEngineering, communication system, in which the semantic features of UniversityofHouston,Houston,TX77004USA,andalsowiththeDepart- frames are extracted into continuous signals and transmitted ment of Computer Science and Engineering, Kyung Hee University, Seoul using analog communication methods. 446-701,SouthKorea(e-mail:hanzhu22@gmail.com). Khaled B. Letaief is with the Department of Electronic and Computer The continuous signals in analog semantic communications Engineering, The Hong Kong University of Science and Technology, Hong have two benefits. One is to allow gradient propagation and Kong,China(e-mail:eekhaled@ust.hk). enable end-to-end optimization. The other is that the contin- DigitalObjectIdentifier10.1109/JSAC.2025.3559149 ©2025TheAuthors.ThisworkislicensedunderaCreativeCommonsAttribution4.0License. Formoreinformation,seehttps://creativecommons.org/licenses/by/4.0/ ---PAGE BREAK--- XIEetal.:HYBRIDDIGITAL-ANALOGSEMANTICCOMMUNICATIONS 2479 uous signals have a high degree of freedom that provides the Q1: Howtoenhancedatasecurityandalleviatetheleveling- smoothness performance optimization varying from channel off and cliff-edge effects? conditions,enablingbetterrobustnessinthelowSNRregimes. Q2: How it be compatible with purely analog and digital However, continuous signals also have flaws. The commer- semantic communication? 
cial encryption algorithms are designed for discrete signals, Q3: How to support the various communication environ- e.g., bit streams, raising concerns about the data security of ments, e.g., the wide bandwidth scenario or the weak continuous signal-based systems. Besides, in some scenarios communication scenario? that require accurate transmission at the bit level, analog The concept of hybrid digital-analog (HDA) joint source- semantic communications cannot meet the requirement due to channel codes [22] was proposed by Mittal et al. in 2002, theapproximatelyinfinitecandidatesetsincontinuoussignals. which proves that HDA codes are capable of theoretically Therefore,digitalsemanticcommunicationshaveattractedthe achievingtheShannonlimit(theoreticallyoptimumdistortion) attention of researchers. and a less severe leveling-off and cliff-edge effects. Since Digitalsemanticcommunications[13],[14],[15],[16],[17], then, the HDA codes have attracted much attention from [18],[19],[20],[21]transmitsemanticinformationinthetype academicsandindustries[23],[24],[25],[26],[27].Skoglund of discrete signal, which maps the source to bit streams or et al. [24] have proposed HDA codes for the bandwidth fixed-size constellations. Tung et al. [13] have proposed the compression scenarios, and Ko¨ken et al. [26] have analyzed quantizedjointsource-channelcodingforimagetransmission, therobustnessofHDAcodeswithbandwidthmismatch.HDA namedDeepJSCC-Q,bymappingthecontinuoussignalstothe transmission is also adopted in the Japanese and Canadian close points in the fixed-size constellations to be compatible television signal transmission [28], where video and speech with some protocols. Similarly, Bo et al. [14] improved the signals are transmitted by analog and digital transceivers, quantized joint source-channel coding by learning transition respectively. Yu et al. 
[29] have designed the HDA joint probabilityfromsourcedatatodiscreteconstellationsymbols, source-channel coding for scalable video transmission, named inwhichtheGumbel-Maxsamplingisemployedtosamplethe WSVC. which takes the 2D discrete wavelet transform for constellation points from the learned transition probability so analog transmission and H.264/AVC for digital transmission. that avoiding the non-differentiable quantization. Guo et al. Lan et al. [30] have formulated the video transmission distor- [16] quantized the semantic information with the learnable tionsfirstandthenproposedasub-optimalresourceallocation non-linear scalar quantizer, which learns to adopt dynamic scheme, which allocates the power and quantization bits. Tan quantizationlevelsfordifferentsemanticvalues.Fuetal.[18] etal.[31]haveproposedtheoptimalresourceallocationforthe have proposed the vector quantized semantic communication Internet-of-things (IoT) scenario. Three factors are optimized system, in which the semantic vectors are quantized into toenhancethequalityoftherecoveredimage,includingdigital bit streams with the learnable vector quantizer and trans- bandwidth,orthogonalpower,andnonorthogonalpowerofthe mitted with the digital channel codings and modulations. analog signal. Yahampath [32] has considered the imperfect Gao et al. [20] have developed an adaptive modulation and channel state information (CSI) for the video transmission, in retransmission scheme by deriving the relationship between which the digital power is allocated by considering the CSI bit-error-rate and the task performance, in which the seman- errors, and the remaining power is used to transmit superim- tic information is quantized into fixed-length bit streams. posed analog QAM symbols. However, these works rely on Huang et al. 
[21] have proposed an iterative training algo- linear transforms and ignore the semantic information behind rithm for digital semantic communications, in which the deep data,whichisunsuitablefornon-linearsemantictransmission. source codec are trained according to the chosen channel Inspired by the concept of HDA codes, we propose a novel coding rate. framework called DL-based HDA semantic communication. The above works on digital semantic communication This framework integrates the strengths of both analog and achieve accurate transmission at the bit or symbol level and digital semantic communications to effectively tackle the part of the works can apply the encryption algorithms to challenges mentioned earlier. Firstly, the HDA semantic com- encrypt the bit streams. However, digital semantic commu- municationsystemscanimprovedatasecurityandalleviatethe nication systems introduce unavoidable quantization errors leveling-off and cliff-edge effects by transmitting part infor- due to the process of quantizing continuous signals to dis- mation with the continuous signals in analog communications crete signals, which introduces the leveling-off effect. That is, (Q1). Besides, analog and digital semantic communications the quality of the decoded source signal is limited because are special cases of HDA semantic communications. By con- of the quantization errors. Besides, digital semantic com- trolling the ratio between analog and digital components, the munications experience the cliff-edge effect varying from HDA semantic communications not only can be transformed different channel conditions, which usually results in a drastic into purely analog or digital semantic communications (Q2) degradation in performance at lower SNRs. Therefore, it is but also support the different communication scenarios (Q3). 
imperative to adopt a new semantic communication paradigm The main contributions are summarized as follows: that can address the limitations of both analog and digi- A novel HDA semantic communication framework is tal semantic communications. This paradigm should enhance • proposed, which takes advantage of analog and digital data security and mitigate the leveling-off and cliff-edge semantic communications and addresses the limitations effects. However, designing such a semantic communication inherent in each. system poses several challenges that need to be overcome, Based on the HDA semantic communication framework, namely, • we propose an HDA semantic communication system, ---PAGE BREAK--- 2480 IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS,VOL.43,NO.7,JULY2025 Fig.1. Theproposedhybriddigital-analogsemanticcommunicationframework. named HDA-DeepSC, for multimedia transmission, in Given an image, I R3 H W, where H and W are the × × ∈ which the new analog-digital allocation and fusion mod- height and width of the image. The semantic information can ules are proposed to generate the analog and digital be extracted by components.Besides,thenewlossfunctionsaredesigned z= (I;α ), (1) t S tocapturethelocalandglobalinformation,alleviatingthe where z RM 1 is the semantic information and (;α ) is distortions from channels, and balancing the source rate. ∈ × S · t denotedasthesemanticencoderwiththeparameterα .Then, To further improve the quality of the recovered images, t • z is split into two parts with analog-digital allocation module we proposed a diffusion-based framework enhanced sig- by nal detection by designing the variance schedule and [z ,z ]= (z;θ ), (2) sampling algorithm. A D A t Based on extensive simulation results, the proposed wherez andz arethesemanticinformationtransmittedby • A D HDA-DeepSC outperforms the conventional and DL- the analog transmitter and the digital transmitter, respectively. 
based communication systems and improves the system (;θ ) is analog-digital allocation with parameters θ . t t A · robustness at the low SNR regime. 1) Analog Transmitter: The encoded symbols for analog The rest of this paper is organized as follows. The sys- semantic transmission are represented as tem model is introduced in Section II. The HDA semantic x = (z ;β ), (3) transmission is proposed in Section III. Section IV details the A C A A t proposed diffusion-based signal detection. Numerical results where x A CLA× 1 is the encoded complex symbols and ∈ are presented in Section V to show the performance of (;β ) is denoted as the analog channel encoder with the C A · t the proposed frameworks. Finally, Section VI concludes this parameter β . t paper. 2) Digital Transmitter: The entropy coding and quantizer Notation: Bold-font variables denote matrices or vectors. will be employed firstly to convert z into bit streams by D Cn m and Rn m represent complex and real matrices of size × × n m, respectively. (µ,σ2) means circularly-symmetric b= E ( Q (z D )), (4) × CN complex Gaussian distribution with mean µ and covariance where b is the bit streams, () and () are denoted as σ2. (µ,σ2) means Gaussian distribution with mean µ and the quantizer and entropy enc Q od · er, resp E ec · tively. Then, b is N covariance σ2. (a,b) means continuous uniform distribution encodedwithdigitalchannelencoders(e.g.,LDPCcodes)and U between a and b. () ∗ denotes the conjugate operation. x[k] fixed-size constellations (e.g., 16-QAM) by · represents the k-th element in the vector. x = ( (b)), (5) D D M C II. SYSTEMMODEL wherex D CLD× 1 istheencodedsymbols, ()represents ∈ M · the fixed-size modulation, and () is denoted as the digital D AsshowninFig.1,weconsiderasingle-inputsingle-output C · channel encoder. 
(SISO)communicationsystem,whichaimstosendmultimedia With the analog and digital symbols, the transmitted sym- overtheair.TheproposedHDASemComframeworkconsists bols are x = [x ,x ] CL 1, where L = L +L . The A D × A D of the HDA transmitter, the wireless channel model, and the bandwidth compression ∈ ratio is defined as η = L . HDA receiver, which employs both digital semantic transmis- 3 × H × W sion and analog semantic transmission. B. Wireless Channel Model When x is transmitted over the block fading channels, the A. The Hybrid Digital-Analog Transmitter received signal can be given by The HDA transmitter consists of a semantic encoder that y =hx+n, (6) extracts the semantic information behind images, analog- digital allocation that allocates the semantic information for wherehisthechannelcoefficientthatremainsconstantwithin analog and digital transmission, and channel encoders that a channel coherence time, n is the additive white Gaussian protect the information over the air. noise(AWGN),inwhichn 0,σ2I .FortheRayleigh ∼CN n L (cid:0) (cid:1) ---PAGE BREAK--- XIEetal.:HYBRIDDIGITAL-ANALOGSEMANTICCOMMUNICATIONS 2481 fadingchannel,thechannelcoefficientfollowsh (0,1); A. Model Design ∼CN for the Rician fading channel, it follows h µ ,σ2 ∼CN h h TheproposedHDA-DeepSCisshowninFig.2.Thedesign with µ h = r/(r+1) and σ h = 1/(r+1), where (cid:0) r is th (cid:1) e of each module is detailed below. Rician coefficient. The SNR is defined as E( x 2)/E( n 2). 1) Semantic Codec: The semantic encoder comprises a (cid:112) (cid:112) (cid:107) (cid:107) (cid:107) (cid:107) convolutional layer and a residual Swin Transformer block. C. The Hybrid Digital-Analog Receiver The first convolutional layer projects the images into vector- shaped tokens, which are used as inputs to the residual Swin The receiver comprises signal detection that estimates the Transformer block in a permutation-invariant manner. 
Then, transmittedsymbols,aanalog-digitalfusionmodulethatfuses the residual Swin Transformer block consists of several Swin the digital and analog semantic information, channel decoders Transformerlayersandaconvolutionlayer,inwhichtheSwin that alleviate the distortions from the wireless channels, and a Transformer layer [33] originates from the Transformer and semantic decoder that recovers the images with the received introduces the local attention and shifted window mechanism semantic information. to improve the visual semantic understanding. Besides, a con- Withtheleastsquares(LS)signaldetection,thetransmitted volutional layer with spatially invariant filters in the residual symbols can be estimated by block can enhance the translational equivariance. The residual h ∗ h ∗ connection allows for aggregation of the shallow and deep xˆ = y =x+ n, (7) h2 h2 semantic features. | | | | Similarly,thesemanticdecoderconsistsoftheresidualSwin where xˆ = [xˆ ,xˆ ] represents the estimated symbols. We A D Transformerblock,convolutionallayers,andpixelshuffle.The assume that h is the perfect CSI. After the signal detection, residualblockistoenhancethevisualsemanticunderstanding. the semantic features are recovered by the analog and digital The residual connection provides a short connection from receivers, respectively. the semantic encoder to the semantic decoder, allowing the 1) Analog Receiver: The semantic features transmitted by processingofreconstructiontofusevaryinglevelsoffeatures. analog communications are estimated by The convolutional layers and pixel shuffle form the recon- ˆz A = CA− 1(xˆ A ;β r ), (8) s u t p ru sa c m tio p n les m t o h d e u f l e e a , tu in re w a h n i d ch pix th e e ls s h u u b f - fl p e ix r e e l al c lo o c n a v t o e l s ut t i h o e n f a e l a l t a u y re e s r where zˆ A is the estimated semantic features and CA− 1( · ;β r ) to reconstruct the transmitted images. is denoted as the analog channel decoder with parameter β r . 
2) Analog-Digital Allocation and Fusion: At the trans- 2) Digital Receiver: For digital semantic transmission, the mitter, the analog-digital allocation module transforms the transmitted bit streams are recovered firstly by original semantic information into essential and auxiliary bˆ= CD− 1 M − 1(xˆ D ) , (9) s p e la m y a s n a ti n c im in p fo o r r m tan at t io ro n l . e T in he bu e il s d s i e n n g tia th l e s i e m m a a g n e t s ic an i d nf t o h r e m o at t i h o e n r where CD− 1( · ) represents th(cid:0)e digital c(cid:1)hannel decoder and parts of semantic information work to improve the quality of 1() is denoted as the fixed-size demodulation. Then, the the image. The essential part includes the basic information − M · semantic features transmitted with digital semantic transmis- about the image, e.g., the low-frequency information, and sion are recovered by needs to be delivered accurately and cryptographically. Only the essential part cannot be obtained, the image cannot be ˆz D = − 1( − 1(bˆ)), (10) built. However, the nature of analog semantic transmission is Q E where 1() and 1() are denoted as the entropy decoder continuous signals and not compatible with discrete encryp- − − E · Q · tion algorithms. Therefore, the essential part is transmitted and dequantizer, respectively. With zˆ and zˆ , the semantic features are fused by accurately by digital communication systems, in which the A D data encryption methods (e.g., symmetric cryptography and ˆz= − 1(ˆz A ,ˆz D ;θ r ), (11) asymmetric cryptography) can be applied to encrypt the bit A streams to guarantee the data security of the essential part. wherezˆ istherecoveredsemanticinformationand 1(;θ ) A − · r A hyper codec is proposed to extract the essential part of isrepresentedastheanalog-digitalfusionmodulewithparam- the original semantic information, which is given by eters θ . 
r Finally, the transmitted image can be reconstructed by z = (z;θ ), (13) D t H Iˆ = S − 1(ˆz;α r ), (12) where H (z;θ t ) is denoted as the hyper encoder. As shown in Fig. 2, the hyper encoder employs two convolutional layers where 1(;α )representsthesemanticdecoderwithparam- S − · r to downsample the original semantic information, such that eter α . r enables a larger receptive field and extracts the essential semantic information. III. HYBRIDDIGITAL-ANALOGSEMANTICTRANSMISSION Theauxiliaryparthelpsimprovethequalityoftherecovered In this section, we design an HDA semantic communica- image,whichistransmittedbyanalogcommunicationsystems tionsystem,namedHDA-DeepSC,forheterogeneouswireless withthefollowingbenefits.Analogcommunicationsystemsdo communication environments. Then, we develop the new loss not have a cliff effect and are suitable for optimizing systems function to train the HDA-DeepSC with the proposed training inanend-to-endmanner.Toextracttheauxiliarypart,wefirst algorithm. analyze the entropy of z conditioned on z˜, H(z z˜), which | ---PAGE BREAK--- 2482 IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS,VOL.43,NO.7,JULY2025 Fig.2. Thestructureoftheproposedhybriddigital-analogsemanticcommunicationsystem. qualifies the uncertainty about z when z˜ is known. In other The design of analog-digital allocation and fusion can also words, it measures the remaining information of z when z˜ is be viewed as a coarse-to-fine processing. The digital and known. The lower bound of H(z z˜) is derived by analog component transmits coarse and auxiliary semantic | information about the basics and supplements of the image, H(z ˜z)=H(z,˜z) H(˜z) | − respectively. The receiver fuses the coarse and auxiliary H(z) H(˜z), (14) semantic information to obtain fine semantic information, ≥ − which is used to recover the high-fidelity images. where the equals hold when z˜ is close to z. 
z˜ = 3) Digital Transceiver: The quantizer module rounds ele- 1 1( (z ));θ is the recovered semantic infor- H − Q − Q D r ments of z to the nearest integer, z˜ . Then, the arithmetic mation based on essential part without consideration of D D (cid:0) (cid:1) coding converts z˜ into bit streams, in which the arithmetic transmission errors. 1(;θ ) is denoted as hyper decoder, D − r H · coding is one kind of entropy coding. The entropy coding wheretwoconvolutionallayersareemployedtoupsampleand requires the distribution of z˜ in advance. Similarly to [34], recover the basic semantic information. D we model z˜ using a non-parametric, fully factorized density Byobserving(14),wecanobtaintheremaininginformation D model by of z when z˜ is known, i.e., the auxiliary part, by 1 1 z A =z − ˜z, (15) p(˜z D | ψ)= p ˜zD[i] | ψ[i] ψ[i] ∗U −2 , 2 (˜z D [i]), where z A is transmitted by analog communications. The (cid:89) i (cid:18) (cid:18) (cid:19)(cid:19) (17) derivation is in Appendix A. At the receiver, the analog-digital fusion module is where ψ[i] is the parameters of each univariate distribution employed to obtain the fine semantic information by fusing p . Like most cases, we model the quantization errors the essential and auxiliary parts, which is given by w z˜ i D th [i] t | h ψ e [i] uniform distribution. Therefore, we convolve each ˆz= − 1(ˆz A ,ˆz D ;θ r )= − 1(ˆz D ;θ r )+ˆz A , (16) non-parametric density with a standard uniform density to A H better match the prior of z˜ . D where 1(;θ ) shares the same weights with the hyper For digital channel codec and modulation, we adopt the − r H · decoder in the transmitter. adaptive modulation and coding for different SNRs. ---PAGE BREAK--- XIEetal.:HYBRIDDIGITAL-ANALOGSEMANTICCOMMUNICATIONS 2483 4) Analog Transeiver: The analog channel codec aims to p(z)logp(z)dz − compress the semantic features and transmit them effectively (cid:90) over the air. 
Similarly to the previous works [35], the analog = p(z)p(ˆz z)logq(z ˆz ,θ )dzdˆz +H(z) D D r D channel codec mainly employs the fully connected layers | | (cid:90) to transmit the semantic information due to global semantic =E E [logq(z ˆz ,θ )]+H(z). z ∼ p(z) ˆzD∼ p(ˆzD| z) | D r information preservation. Compared with the convolutional (20) neural network (CNN) layer to capture the local information, where the inequation follows KL[p(z zˆ ),q(z zˆ ,θ )] 0, the dense layer is good at capturing global information and | D | D r ≥ in which KL[, ] is the Kullback-Leibler (KL) divergence and preserving the entire attributes, which follows the target of · · q(z zˆ ,θ ) is the variational approximation of p(z zˆ ). the analog channel codec. This can enhance the system’s | D r | D For the sake of argument, assume for a moment that the robustness to channel noise. likelihood is given by B. Loss Function Design q(z ˆz D ,θ r )= z,(2λ z ) − 1I , (21) | N The wireless multimedia transmission problem can be (cid:16) (cid:17) viewed as the classical rate-distortion optimization problem, where z = 1(ˆz ;θ ). The log-likelihood then works out − D r which includes distortion and rate constraints. H to be the squared difference between z and z weighted by λ . z 1) Loss Function Design for Distortion Constraints: The Then, the I(z,ˆz ) can be rewritten as D distortion constraint can be categorized into semantic and (cid:98) channel distortion constraints. For semantic distortion con- I(z,ˆz ) λ E z z 2 +H(z)+constant. (22) D z straint, except for the pixel difference considered in most ≥− (cid:107) − (cid:107) (cid:104) (cid:105) works, we further introduce the frequency difference of the Submitting (22) into (19) and omitting the constant, the images. The designed loss function for semantic distortion can be written as CD L constraint is given by E[ z ˆz ]+λ E z z 2 H(z). 
(23) =E I Iˆ 2+λ (I) (Iˆ) , (18) L CD ≈ (cid:107) − (cid:107) z (cid:107) − (cid:107) − SD L (cid:107) − (cid:107) F|F −F | (cid:104) (cid:105) (cid:104) (cid:105) If we freeze the semantic codec during training, H(z) can be where λ is the weight and () represents the Fourier F F · technically dropped out from CD . transform. The first item in (18) refers to the pixel difference L 2) Loss Function Design for Rate Constraints: For rate of the image, we assume that the pixels of the image follow constraint, the analog transmitter designs the fixed-length the Gaussian distribution without loss of generality and thus output. Therefore, we consider the rate constraint for the employ the mean-square error (MSE) loss. The second item digital transmitter, which is given in (18) refers to the frequency difference of the image, we considerthelearningoflong-rangedependenciesoftheimage =E[ log(p(˜z ψ)))], (24) Rate D L − | and design the Fourier-based loss function. In detail, we map the images into the frequency domain and compare the where p(z˜ D ψ) is given in (17). By minimizing the rate | difference between the original and transmitted images. The constraint, we can optimize the distribution of z˜ D and reduce reasons behind the design can be summarized as the number of bits generated by the arithmetic coding. The MSE loss guides the neural networks to recover • the local pixels of the images by comparing the pixel C. Training Details difference,whichignoresthelong-rangedependenciesof The proposed training algorithm is shown in Algorithm 1. the image. We adopt three-stage training methods. The first stage is The Fourier-based loss can help the neural network learn • the long-range dependencies of the image. 
This is because the same frequency in the frequency domain involves pixels at different positions of the image.

For the channel distortion constraint, we consider both the distortions from channels and the transmission of essential information. The designed loss function is given by

$$\mathcal{L}_{CD} = \mathbb{E}\left[\left\|\mathbf{z} - \hat{\mathbf{z}}\right\|\right] - I(\mathbf{z}, \hat{\mathbf{z}}_D), \tag{19}$$

where the first term minimizes the distortions from channels and the second term maximizes the mutual information between $\mathbf{z}$ and $\hat{\mathbf{z}}_D$ so that $\hat{\mathbf{z}}_D$ contains more information about $\mathbf{z}$. However, directly optimizing $I(\mathbf{z}, \hat{\mathbf{z}}_D)$ is hard. We derive the lower bound of $I(\mathbf{z}, \hat{\mathbf{z}}_D)$ by

$$
\begin{aligned}
I(\mathbf{z}, \hat{\mathbf{z}}_D) &= \int p(\mathbf{z}, \hat{\mathbf{z}}_D) \log \frac{p(\mathbf{z}|\hat{\mathbf{z}}_D)}{p(\mathbf{z})}\, d\mathbf{z}\, d\hat{\mathbf{z}}_D \\
&\geq \int p(\mathbf{z}, \hat{\mathbf{z}}_D) \log q(\mathbf{z}|\hat{\mathbf{z}}_D, \boldsymbol{\theta}_r)\, d\mathbf{z}\, d\hat{\mathbf{z}}_D + H(\mathbf{z}),
\end{aligned}
$$

where $q(\mathbf{z}|\hat{\mathbf{z}}_D, \boldsymbol{\theta}_r)$ is a variational approximation of $p(\mathbf{z}|\hat{\mathbf{z}}_D)$, and the inequality follows from the non-negativity of the Kullback-Leibler divergence.

In the first stage, the semantic codec is trained with $\mathcal{L}_{SD}$, which enables effective semantic extraction. After the semantic codec finishes training, the second stage trains the hybrid transceiver with $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$, which aims to reduce both the distortions from physical channels and the number of bit streams. We can drop the $H(\mathbf{z})$ term in $\mathcal{L}_{CD}$ since the semantic codec is frozen during this stage. The non-differentiable operations, e.g., quantization, entropy coding, and modulation, block the gradient back-propagation from the receiver to the transmitter. Therefore, we substitute additive uniform noise for the non-differentiable operations during training, i.e., $\tilde{\mathbf{z}}_D = \mathbf{z}_D + \mathbf{u}$ in line 10 of Algorithm 1. Besides, we assume error-free transmission of $\tilde{\mathbf{z}}_D$ for two reasons: first, the number of generated bit streams is much smaller than that of conventional source coding, e.g., JPEG; second, digital communication offers accurate bit transmission. Finally, we train the whole network with $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$ to improve the quality of the recovered image and reduce the number of bit streams in an end-to-end manner, which converges toward the global optimum.

Algorithm 1: HDA-DeepSC Training Algorithm
 1: Function Train Semantic Codec()
      Input: Sample $\mathbf{I}$ from dataset.
 2:   $\mathbf{z} = \mathcal{S}(\mathbf{I}; \boldsymbol{\alpha}_t)$,
 3:   $\hat{\mathbf{I}} = \mathcal{S}^{-1}(\mathbf{z}; \boldsymbol{\alpha}_r)$,
 4:   Compute $\mathcal{L}_{SD}$ with (18).
 5:   Train $\boldsymbol{\alpha}_t, \boldsymbol{\alpha}_r$ with gradient descent on $\mathcal{L}_{SD}$.
      Return: $\mathcal{S}(\cdot; \boldsymbol{\alpha}_t)$ and $\mathcal{S}^{-1}(\cdot; \boldsymbol{\alpha}_r)$.
 6: Function Train Hybrid Transceiver()
      Input: Freeze semantic codec and sample semantic features $\mathbf{z}$.
 7:   Transmitter:
 8:     // Analog-Digital Allocation
 9:     $\mathbf{z}_D = \mathcal{H}(\mathbf{z}; \boldsymbol{\theta}_t)$,
10:     $\tilde{\mathbf{z}}_D = \mathbf{z}_D + \mathbf{u}$, $\mathbf{u} \sim \mathcal{U}\left(-\frac{1}{2}, \frac{1}{2}\right)$,
11:     $\tilde{\mathbf{z}} = \mathcal{H}^{-1}(\tilde{\mathbf{z}}_D; \boldsymbol{\theta}_r)$,
12:     $\mathbf{z}_A = \mathbf{z} - \tilde{\mathbf{z}}$.
13:     // Digital Transmitter
14:     Error-free transmission of $\tilde{\mathbf{z}}_D$ to avoid the disappearing gradient.
15:     // Analog Transmitter
16:     $\mathbf{x}_A = \mathcal{C}_A(\mathbf{z}_A; \boldsymbol{\beta}_t)$,
17:     Power normalization,
18:     Transmit $\mathbf{x}_A$ over the air.
19:   Receiver:
20:     Receive $\mathbf{y}_A$ with (6) and $\tilde{\mathbf{z}}_D$.
21:     // Digital Receiver
22:     $\hat{\mathbf{z}}_D = \tilde{\mathbf{z}}_D$,
23:     $\breve{\mathbf{z}} = \mathcal{H}^{-1}(\hat{\mathbf{z}}_D; \boldsymbol{\theta}_r)$.
24:     // Analog Receiver
25:     Signal detection by (7) to get $\hat{\mathbf{x}}_A$,
26:     $\hat{\mathbf{z}}_A = \mathcal{C}_A^{-1}(\hat{\mathbf{x}}_A; \boldsymbol{\beta}_r)$.
27:     // Analog-Digital Fusion
28:     $\hat{\mathbf{z}} = \breve{\mathbf{z}} + \hat{\mathbf{z}}_A$.
29:   Compute $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$ with (23) and (24).
30:   Train $\boldsymbol{\theta}_t, \boldsymbol{\theta}_r, \boldsymbol{\beta}_t, \boldsymbol{\beta}_r$ with gradient descent on $\mathcal{L}_{CD} + \lambda_r \mathcal{L}_{Rate}$.
      Return: $\mathcal{H}(\cdot; \boldsymbol{\theta}_t)$, $\mathcal{C}_A(\cdot; \boldsymbol{\beta}_t)$, $\mathcal{C}_A^{-1}(\cdot; \boldsymbol{\beta}_r)$, and $\mathcal{H}^{-1}(\cdot; \boldsymbol{\theta}_r)$.
31: Function Train Whole Network()
      Input: Sample $\mathbf{I}$ from dataset.
32:   Repeat lines 2, 8-28, and 3 to get $\hat{\mathbf{I}}$.
33:   Compute $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$ with (18) and (24).
34:   Train $\boldsymbol{\alpha}_t, \boldsymbol{\alpha}_r, \boldsymbol{\theta}_t, \boldsymbol{\theta}_r, \boldsymbol{\beta}_t, \boldsymbol{\beta}_r$ with gradient descent on $\mathcal{L}_{SD} + \lambda_r \mathcal{L}_{Rate}$.
      Return: The HDA-DeepSC.

Algorithm 2: HDA-DeepSC Inference Algorithm
 1: Function HDA-DeepSC Inference()
      Input: Sample $\mathbf{I}$ from dataset.
 2:   Transmitter:
 3:     $\mathbf{z} = \mathcal{S}(\mathbf{I}; \boldsymbol{\alpha}_t)$.
 4:     // Analog-Digital Allocation
 5:     $\mathbf{z}_D = \mathcal{H}(\mathbf{z}; \boldsymbol{\theta}_t)$,
 6:     $\tilde{\mathbf{z}}_D = \mathcal{Q}(\mathbf{z}_D)$,
 7:     $\tilde{\mathbf{z}} = \mathcal{H}^{-1}(\mathcal{Q}^{-1}(\tilde{\mathbf{z}}_D); \boldsymbol{\theta}_r)$,
 8:     $\mathbf{z}_A = \mathbf{z} - \tilde{\mathbf{z}}$.
 9:     // Digital Transmitter
10:     $\mathbf{b} = \mathcal{E}(\tilde{\mathbf{z}}_D)$,
11:     $\mathbf{x}_D = \mathcal{M}(\mathcal{C}_D(\mathbf{b}))$.
12:     // Analog Transmitter
13:     $\mathbf{x}_A = \mathcal{C}_A(\mathbf{z}_A; \boldsymbol{\beta}_t)$,
14:     Power normalization,
15:     Transmit $\mathbf{x} = [\mathbf{x}_D, \mathbf{x}_A]$ over the air.
16:   Receiver:
17:     Receive $\mathbf{y}$ with (6).
18:     Signal detection by (7) to get $\hat{\mathbf{x}}_A$ and $\hat{\mathbf{x}}_D$.
19:     // Digital Receiver
20:     $\hat{\mathbf{b}} = \mathcal{C}_D^{-1}(\mathcal{M}^{-1}(\hat{\mathbf{x}}_D))$,
21:     $\hat{\mathbf{z}}_D = \mathcal{Q}^{-1}(\mathcal{E}^{-1}(\hat{\mathbf{b}}))$,
22:     $\breve{\mathbf{z}} = \mathcal{H}^{-1}(\hat{\mathbf{z}}_D; \boldsymbol{\theta}_r)$.
23:     // Analog Receiver
24:     $\hat{\mathbf{z}}_A = \mathcal{C}_A^{-1}(\hat{\mathbf{x}}_A; \boldsymbol{\beta}_r)$.
25:     // Analog-Digital Fusion
26:     $\hat{\mathbf{z}} = \breve{\mathbf{z}} + \hat{\mathbf{z}}_A$,
27:     $\hat{\mathbf{I}} = \mathcal{S}^{-1}(\hat{\mathbf{z}}; \boldsymbol{\alpha}_r)$.
      Return: $\hat{\mathbf{I}}$.

When the whole network has been trained, we can employ the model to transmit images wirelessly. The inference algorithm is presented in Algorithm 2. We remove the additive uniform noise and replace it with the non-differentiable operations. The three-stage training algorithm ensures that each stage can converge to a local optimum and avoids the mismatch of gradient descent. Besides, the approximate quantization noise helps avoid the disappearing gradient, which enables end-to-end training.
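The allocation, noise-proxy, and fusion steps of Algorithms 1 and 2 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the helper names `allocate` and `fuse` are hypothetical, the linear maps `h_enc` and `h_dec` are toy stand-ins for the learned allocation networks, and the quantizer is assumed to be rounding to the nearest integer.

```python
import numpy as np

def allocate(z, h_enc, h_dec, rng=None):
    """Analog-digital allocation (hypothetical helper, not the authors' code).

    z     : semantic features
    h_enc : stand-in for the allocation encoder H(.; theta_t)
    h_dec : stand-in for the allocation decoder H^{-1}(.; theta_r)
    rng   : pass a Generator to use the training proxy; None selects inference
    """
    z_d = h_enc(z)
    if rng is not None:
        # Training: additive uniform noise replaces quantization (differentiable proxy).
        z_d_tilde = z_d + rng.uniform(-0.5, 0.5, size=z_d.shape)
    else:
        # Inference: hard quantization, assumed here to be rounding to the nearest integer.
        z_d_tilde = np.round(z_d)
    z_a = z - h_dec(z_d_tilde)  # analog residual z_A = z - z_tilde
    return z_d_tilde, z_a

def fuse(z_d_hat, z_a_hat, h_dec):
    """Analog-digital fusion: z_hat = H^{-1}(z_D_hat) + z_A_hat."""
    return h_dec(z_d_hat) + z_a_hat

# Toy stand-ins for the learned allocation networks (assumed linear maps).
h_enc = lambda v: 4.0 * v
h_dec = lambda v: v / 4.0

rng = np.random.default_rng(0)
z = rng.normal(size=(2, 8))

z_d_train, z_a_train = allocate(z, h_enc, h_dec, rng=rng)  # training proxy
z_d_infer, z_a_infer = allocate(z, h_enc, h_dec)           # inference path

# With error-free digital delivery and a noiseless analog link, fusion returns z.
z_hat_train = fuse(z_d_train, z_a_train, h_dec)
z_hat_infer = fuse(z_d_infer, z_a_infer, h_dec)
```

In this toy setting the analog branch carries exactly what the digital branch loses, so the perturbation applied to the digital latent (uniform noise or rounding) never exceeds half a quantization step and the fused features match the input; over a real channel the analog residual would additionally be corrupted by noise.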
Moreover, the inference algorithm indicates that the digital component can adopt an encryption algorithm to protect the digital bits and adaptive modulation and coding to combat channel distortions.

IV. DIFFUSION FRAMEWORK ENHANCED SIGNAL DETECTION

This section provides an overview of the de-noising diffusion framework and its background. Subsequently, we introduce a novel diffusion-based signal detection method called DiffSDNet. DiffSDNet is developed by incorporating a carefully designed variance schedule into the training and sampling algorithms. The diffusion-based de-noising module is an optional part of HDA-DeepSC, which can further improve its robustness.

A. De-Noising Diffusion Framework

Given a random noise as input, the de-noising diffusion framework [36] models the generative process through multiple de-noising steps. Each step iteratively enhances the generated result by removing the predicted noise, akin to Langevin dynamics. The de-noising diffusion framework is divided into a forward process and a reverse process.

1) Forward Process: The forward process is fixed to a Markov chain with $T$ steps that gradually adds Gaussian noise to the data according to a variance schedule $\gamma^{(1)}, \cdots, \gamma^{(T)}$, which is given by

$$\mathbf{v}^{(0)} \to \mathbf{v}^{(1)} \to \mathbf{v}^{(2)} \to \cdots \to \mathbf{v}^{(T-1)} \to \mathbf{v}^{(T)}, \tag{25}$$

where $\mathbf{v}^{(0)}$ is the input information, $p\left(\mathbf{v}^{(t)} \middle| \mathbf{v}^{(t-1)}\right) = \mathcal{N}\left(\sqrt{1-\gamma^{(t)}}\,\mathbf{v}^{(t-1)}, \gamma^{(t)}\mathbf{I}\right)$, and $p(\mathbf{v}^{(T)})$ is modeled with $\mathcal{N}(\mathbf{0}, \mathbf{I})$. Due to the reparameterization of the normal distribution, $\mathbf{v}^{(t)}$ can be represented as

$$
\begin{aligned}
\mathbf{v}^{(t)} &= \sqrt{1-\gamma^{(t)}}\,\mathbf{v}^{(t-1)} + \sqrt{\gamma^{(t)}}\,\boldsymbol{\epsilon}^{(t)} \\
&= \sqrt{1-\left(\bar{\gamma}^{(t)}\right)^2}\,\mathbf{v}^{(0)} + \bar{\gamma}^{(t)}\,\bar{\boldsymbol{\epsilon}}^{(t)},
\end{aligned} \tag{26}
$$

where $\bar{\boldsymbol{\epsilon}}^{(t)} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and $\bar{\gamma}^{(t)} = \sqrt{1 - \prod_{i=1}^{t}\left(1-\gamma^{(i)}\right)}$. As observed from (26), the forward process recurrently adds Gaussian noise step by step to make $\mathbf{v}^{(0)}$ approach the normal distribution, which can be viewed as an encoding process without learnable parameters.

2) Reverse Process: The reverse process is also defined as a Markov chain with $T$ steps starting at $\mathbf{v}^{(T)}$, which is given by

$$\mathbf{v}^{(T)} \to \mathbf{v}^{(T-1)} \to \mathbf{v}^{(T-2)} \to \cdots \to \mathbf{v}^{(1)} \to \mathbf{v}^{(0)}, \tag{27}$$

where $q\left(\mathbf{v}^{(t-1)} \middle| \mathbf{v}^{(t)}\right) = \mathcal{N}\left(\boldsymbol{\mu}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right), \sigma^{(t)}\mathbf{I}\right)$. The reverse process generates $\mathbf{v}^{(t-1)}$ based on $\mathbf{v}^{(t)}$, in which the mean of $\mathbf{v}^{(t-1)}$ is modeled by a neural network with $\mathbf{v}^{(t)}$ as input.

From (26), we can observe that $\mathbf{v}^{(t-1)}$ can be predicted from $\mathbf{v}^{(t)}$ and $\mathbf{v}^{(0)}$ by removing the added noise. Therefore, $\boldsymbol{\mu}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right)$ can be modeled as

$$\boldsymbol{\mu}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right) = \frac{1}{\sqrt{1-\gamma^{(t)}}}\left(\mathbf{v}^{(t)} - \frac{\gamma^{(t)}}{\bar{\gamma}^{(t)}}\,\boldsymbol{\epsilon}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right)\right), \tag{28}$$

where $\boldsymbol{\epsilon}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right)$ predicts the noise added to $\mathbf{v}^{(t)}$. From (28), the reverse process predicts the Gaussian noise at each step and then removes the predicted noise to restore $\mathbf{v}^{(0)}$ from $\mathbf{v}^{(T)}$ with learnable parameters, which can be viewed as the decoding process.

The loss function for the diffusion-based model at step $t$ is defined as

$$\mathcal{L}_{Diff}^{(t)} = \mathbb{E}\left[\left\|\bar{\boldsymbol{\epsilon}}^{(t)} - \boldsymbol{\epsilon}\left(\sqrt{1-\left(\bar{\gamma}^{(t)}\right)^2}\,\mathbf{v}^{(0)} + \bar{\gamma}^{(t)}\,\bar{\boldsymbol{\epsilon}}^{(t)}; \boldsymbol{\omega}, t\right)\right\|^2\right]. \tag{29}$$

During training, we first sample $t$ and then obtain $\mathbf{v}^{(t)}$ from $\mathbf{v}^{(0)}$ by adding Gaussian noise with the scheduled variances.

B. The Proposed De-Noising Diffusion-Based Signal Detection

The detected signals in (7) can be rewritten as

$$\hat{\mathbf{x}} = \mathbf{x} + \tilde{\mathbf{n}}, \tag{30}$$

where $\tilde{\mathbf{n}} = \frac{h^{*}\mathbf{n}}{|h|^2}$ is the effective noise after signal detection. We employ the block-fading channel model in (6), where $h$ remains constant. Therefore, $\tilde{\mathbf{n}}$ follows a circularly symmetric complex Gaussian distribution with zero mean and scaled variance $\sigma_{\tilde{n}}^2 = \sigma_n^2 / |h|^2$.

Since the coefficients of $p\left(\mathbf{v}^{(t)} \middle| \mathbf{v}^{(t-1)}\right)$ in (25) should satisfy $\left(\sqrt{1-\gamma^{(t)}}\right)^2 + \gamma^{(t)} = 1$, we rewrite $\hat{\mathbf{x}}$ as

$$\tilde{\mathbf{x}} = \frac{1}{\sqrt{1+\sigma_{\tilde{n}}^2}}\,\mathbf{x} + \frac{\sigma_{\tilde{n}}}{\sqrt{1+\sigma_{\tilde{n}}^2}}\,\boldsymbol{\epsilon}, \tag{31}$$

where $\tilde{\mathbf{x}} = \hat{\mathbf{x}} / \sqrt{1+\sigma_{\tilde{n}}^2}$ and $\tilde{\mathbf{n}} = \sigma_{\tilde{n}}\boldsymbol{\epsilon}$, $\boldsymbol{\epsilon} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$.

Comparing (31) with (26), we find that wireless transmission is similar to the forward process. We model $\mathbf{x}$ and $\tilde{\mathbf{x}}$ in (31) as $\mathbf{v}^{(0)}$ and $\mathbf{v}^{(t)}$ in (26). It is natural to employ the reverse process to refine $\tilde{\mathbf{x}}$ and thereby obtain a more accurate $\mathbf{x}$. Given $\tilde{\mathbf{x}}$ and $\sigma_{\tilde{n}}$, we adopt (27) to remove the noise in $\tilde{\mathbf{x}}$ and approach $\mathbf{x}$. However, the existing variance schedule of $p\left(\mathbf{v}^{(t)} \middle| \mathbf{v}^{(t-1)}\right)$ and sampling algorithm are unsuitable for wireless communications. We need to design the variance schedule and sampling algorithm by considering the channel SNR.

1) Variance Schedule Design: A variance schedule refers to the way in which the mean and variance of the added noise change over the course of the diffusion process. During this process, the mean and variance of the added noise are adjusted at each step, affecting the amount of noise introduced at each stage; the variance schedule therefore determines how the noise level evolves during the diffusion process. A variance schedule can impact the quality of the generated $\mathbf{x}$ and the model's convergence behavior.

The variance schedule should drive the signal coefficient in (26) to zero at the final step, i.e., $\sqrt{1-\left(\bar{\gamma}^{(T)}\right)^2} \to 0$. Based on this constraint, we design the variance schedule with $T = 50$ steps, which is given by

$$\gamma^{(t)} = \frac{0.5t}{T}, \tag{32}$$

for which $\sqrt{1-\left(\bar{\gamma}^{(50)}\right)^2} \approx e^{-6.375} \approx 0$, i.e., $\bar{\gamma}^{(50)} \approx 1$. The designed variance schedule includes 50 different noise levels. The reasons behind the designed variance schedule can be summarized as follows.
• Compared with the conventional diffusion-based framework with 1,000 steps for generative tasks, we empirically find that the de-noising task does not need as many steps due to its low complexity.
• We design $\gamma^{(t)}$ as a monotonic function of $t$ to achieve coarse-to-fine de-noising, which yields unequal SNR intervals between steps, e.g., a small interval in high SNR regions and a large interval in low SNR regions.
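As an illustration of the schedule in (32) and the closed form in (26), the following NumPy sketch builds the 50-step schedule, the accumulated noise coefficient, and a one-shot forward sample. It is a sketch under simplifying assumptions rather than the authors' code: signals are real-valued and unit-power (the detected signals in the text are complex), and the names `forward_sample`, `signal_coef`, and `first_order` are hypothetical.

```python
import numpy as np

T = 50
t = np.arange(1, T + 1)
gamma = 0.5 * t / T                      # (32): gamma^(t) = 0.5 t / T
alpha_bar = np.cumprod(1.0 - gamma)      # running product of (1 - gamma^(i)) up to step t
gamma_bar = np.sqrt(1.0 - alpha_bar)     # noise coefficient in (26)
signal_coef = np.sqrt(alpha_bar)         # signal coefficient sqrt(1 - gamma_bar^2) in (26)

# First-order estimate of the final signal coefficient: exp(-sum(gamma) / 2).
first_order = np.exp(-0.5 * gamma.sum())

def forward_sample(v0, step, rng):
    """Draw v^(step) from v^(0) in one shot using the closed form in (26)."""
    eps = rng.standard_normal(v0.shape)
    k = step - 1                         # steps are 1-indexed in the text
    return signal_coef[k] * v0 + gamma_bar[k] * eps

# Per-step SNR of v^(t) in dB, assuming unit-power v^(0).
snr_db = 10.0 * np.log10(alpha_bar / (1.0 - alpha_bar))

rng = np.random.default_rng(0)
v0 = rng.standard_normal(100_000)
vT = forward_sample(v0, T, rng)          # essentially pure noise at the last step
```

With $T = 50$ the schedule sums to 12.75, so the first-order estimate of the final signal coefficient is $e^{-6.375}$; the exact product is smaller still, which is why the last step can be treated as pure Gaussian noise. The array `snr_db` gives the noise level attached to each step, which is the quantity a receiver would compare against its measured SNR.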
The unequal SNR intervals can speed up de-noising, since fewer steps are spent in low SNR regions.

2) Sampling Algorithm: The sampling algorithm performs the reverse process by sampling the steps. For example, the conventional diffusion-based framework usually samples 1,000 steps from $T \to 0$ [36] or 100 steps with a subsequence of $T \to 0$ [37], in which $\mathbf{v}^{(T)}$ is the first sampled step. However, starting from $\mathbf{v}^{(T)}$ is unsuitable for signal detection. The detected signals start from different $\mathbf{v}^{(t)}$, where $t$ depends on the received SNR at the receiver. Therefore, we propose a dynamic sampling algorithm, shown in Algorithm 3.

Algorithm 3: Dynamic Sampling Algorithm
1: Function Dynamic Sampling()
     Input: The detected signal $\tilde{\mathbf{x}}$.
2:   Initialize $\tilde{\mathbf{x}}$ as the starting point.
3:   Find $\tilde{t}$ by $\sigma_{\tilde{n}}$ with (33).
4:   for $t = \tilde{t}, \cdots, 1$ do
5:     $\mathbf{v}^{(t-1)} = \frac{1}{\sqrt{1-\gamma^{(t)}}}\left(\mathbf{v}^{(t)} - \frac{\gamma^{(t)}}{\bar{\gamma}^{(t)}}\,\boldsymbol{\epsilon}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right)\right)$
     Return: $\mathbf{v}^{(0)}$.

Firstly, given the known $\sigma_{\tilde{n}}$, we search for the starting point $\tilde{t}$ of the reverse process by matching the signal coefficients of (31) and (26), which is given by

$$\frac{1}{\sqrt{1+\sigma_{\tilde{n}}^2}} \in \left[\sqrt{1-\left(\bar{\gamma}^{(\tilde{t}+1)}\right)^2},\ \sqrt{1-\left(\bar{\gamma}^{(\tilde{t})}\right)^2}\right]. \tag{33}$$

Then, signal detection aims to recover the transmitted signals as accurately as possible. Therefore, we change the random sampling to deterministic sampling. In detail, we reduce the degree of randomness in the reverse process by setting $\sigma^{(t)}$ in (27) to zero, which means that $q\left(\mathbf{v}^{(t-1)} \middle| \mathbf{v}^{(t)}\right)$ changes from $\mathcal{N}\left(\boldsymbol{\mu}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right), \sigma^{(t)}\mathbf{I}\right)$ to the deterministic $\boldsymbol{\mu}\left(\mathbf{v}^{(t)}; \boldsymbol{\omega}\right)$.

Compared with previous de-noising frameworks, e.g., DnCNN, which predict the noise in only one step, the de-noising diffusion framework predicts the noise over multiple steps, so that it matches the distribution of the noise and achieves better de-noising performance. This is why we propose a de-noising diffusion-based signal detection method.

V. NUMERICAL RESULTS

In this section, we compare the proposed HDA-DeepSC with DL-based semantic communication systems and digital communication systems over AWGN and Rician fading channels, where we assume perfect CSI for all schemes.

A. Implementation Details

1) The Dataset: We choose the DIV2K dataset [38] for training, which contains 1,000 images with different scenes. The Kodak dataset is used for testing.

2) Training Settings: The semantic codec consists of 6 Swin-Transformer layers, each with 6 heads and a width of 120. The diffusion-based model adopts the structure of OpenAI-UNet. The weights $\lambda_F$, $\lambda_z$, and $\lambda_r$ are 0.1, 0.1, and 0.0005, respectively. The learning rate is $2 \times 10^{-4}$. The simulation platform consists of an Intel® Xeon® Platinum 8352V and an NVIDIA GeForce RTX 4090. The encryption algorithm is AES.

3) Benchmarks and Performance Metrics: We adopt separate source-channel coding, the DL-based analog semantic communication system, the DL-based digital semantic communication system, and the one-step denoising network as the benchmarks, which are detailed as follows.
• Separate source-channel coding: Source and channel coding are employed separately to transmit the images, using the following technologies, respectively:
  • Better Portable Graphics (BPG) for image source coding, the state-of-the-art image compression method.
  • Low-density parity check (LDPC) coding and capacity-achieved coding are used for the channel coding.
  • Adaptive modulation and coding (AMC) is employed for different SNRs, including 1/2 coding rate with BPSK, 1/2 coding rate with QPSK, 3/4 coding rate with QPSK, 1/2 coding rate with 16QAM, and 3/4 coding rate with 16QAM.
• Analog semantic communication systems: The purely analog semantic communication of HDA-DeepSC trained with MSE loss.
• Digital semantic communication systems: The DeepJSCC-Q proposed in [13].
• Conventional HDA transmission systems with 2D discrete cosine transform and scalar quantization [30].
• Denoising convolutional neural network (DnCNN) as the one-step de-noising benchmark.

The LDPC codes we use are from the 802.11ad standard, with blocklength 672 bits for both the 1/2 and 3/4 rate codes. The coherence time is set as the transmission time for each image in the simulation. We set $r = 1$ for the Rician channels and $h = 1$ for the AWGN channels. Peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM) are used as the metrics to measure the local and global quality of images. MS-SSIM is expressed in dB by

$$\text{MS-SSIM (dB)} = -10\log_{10}\left(1 - \text{MS-SSIM}\right). \tag{34}$$

B. Denoising Networks Comparisons

Fig. 3. The PSNR performance comparison between Analog DeepSC and Analog DeepSC with different denoisers on the Kodak dataset.

Fig. 3 presents the PSNR performance for the analog DeepSC with different denoisers. First, observe that the analog DeepSC with a denoiser has a larger PSNR than that without a denoiser in the low SNR regimes. This validates the effectiveness of the denoiser in reducing the noise level. For the small noise level in the high SNR regimes, the analog DeepSC is capable of restoring the signals; therefore, all methods achieve a similar PSNR as the SNR increases. Furthermore, we observe that the analog DeepSC with DiffSDNet outperforms that with DnCNN by 0.6 dB in terms of PSNR. This suggests that the multiple-step denoiser has a stronger denoising capability than the one-step denoiser.

TABLE I. THE PSNR COMPARISON BETWEEN THE ANALOG DEEPSC WITH DIFFERENT DIFFUSION-BASED DENOISERS AT SNR = 0 dB

Table I shows the comparison between analog DeepSC with DDPM and DiffSDNet. The proposed DiffSDNet can achieve higher PSNR with fewer sampling steps than the DDPM, confirming the effectiveness of the designed variance schedule and sampling algorithm. In particular, the PSNR of analog DeepSC with DDPM decreases as the number of sampling steps increases. This is due to the high degree of randomness introduced in the reverse process.

C. Communication System Comparisons

Fig. 4. Comparison between HDA-DeepSC and the Analog DeepSC, DeepJSCC-Q, and BPG with different channel coding on the Kodak dataset over AWGN channels.

Fig. 5. Comparison between HDA-DeepSC and the Analog DeepSC, DeepJSCC-Q, and BPG with different channel coding on the Kodak dataset over Rician channels.

Figs. 4 and 5 report the PSNR and MS-SSIM comparison between the various methods over AWGN channels and Rician channels with a 1/6 bandwidth compression ratio. For AWGN channels, we can see in Fig. 4 that our HDA-DeepSC outperforms all the benchmarks. This indicates that the discrete signals of the digital component can accurately deliver crucial semantic information for detail recovery, and the continuous signals of the analog component can prevent the leveling-off and cliff-edge effects with lower quantization errors. Besides, the HDA-DeepSC achieves the best performance in terms of MS-SSIM, which means that the images transmitted by HDA-DeepSC have better global quality. This is likely because we introduce the Fourier-based loss function that makes the model learn the long-distance dependencies. For the Rician channel case shown in Fig. 5, we observe that the DL-based analog systems are more robust to channel changes due to the high degree of freedom in continuous signals, from which the HDA-DeepSC benefits through its analog component. Moreover, the low bandwidth consumption of the digital part allows us to use low-rate channel coding to achieve accurate delivery while transmitting a small number of symbols, thereby ensuring robustness in the low SNR regimes. This is the reason why we assume error-free transmission while training the digital part. Besides, if the communication environment is so poor that the digital signals cannot be successfully decoded, this system will experience the cliff-edge effect due to the employed entropy coding. This can be improved in several ways. One is to replace the entropy coding module with a learning-based quantization module. Another is to introduce transmission errors during training. Both methods can lead the model to learn to correct the errors in digital transmission. Visual examples are presented in Appendix B.

TABLE II. THE ABLATIONS OF FOURIER-BASED COMPONENT: MSE LOSS, MSE LOSS WITH FOURIER-BASED MODULE, AND MSE LOSS WITH FOURIER-BASED LOSS

In Table II, we study the ablations of Fourier-based components by only considering the semantic codec, in which the MSE loss with Fourier-based module means that we insert the pluggable Fourier-based modules [39] into the semantic codec and train the semantic codec with the MSE loss function. The Fourier-based module or loss can improve the quality of images by more than 2 dB in terms of PSNR and MS-SSIM due to the long-distance dependencies learned in the frequency domain. Besides, we observe that the Fourier-based loss increases MS-SSIM considerably more than the Fourier-based module. The reason is that the Fourier-based module introduces additional Fourier-based parameters, making it challenging to further improve its performance. This suggests that the Fourier-based loss can directly capture the global information of images without additional parameters and is hence an attractive loss to improve the global quality of images.

Fig. 6. PSNR and MS-SSIM performance for different bandwidth compression ratios on the Kodak dataset over AWGN channels.

Fig. 7. PSNR and MS-SSIM performance for different bandwidth compression ratios on the Kodak dataset over Rician channels.

D. Bandwidth Compression Ratio Comparisons

Figs. 6 and 7 demonstrate the comparisons for different bandwidth compression ratios over AWGN and Rician channels at SNR = 10 dB. The HDA-DeepSC outperforms all the benchmarks in terms of PSNR and MS-SSIM. For example, the HDA-DeepSC achieves the same PSNR as separate coding (the BPG with 1/2 LDPC and 16QAM) with a 33% improvement on bandwidth compression ratio. This suggests
This suggests the pluggable Fourier-based modules [39] into the semantic that the HDA-DeepSC can provide a higher data transmission ---PAGE BREAK--- XIEetal.:HYBRIDDIGITAL-ANALOGSEMANTICCOMMUNICATIONS 2489 Fig.8. PSNRandMS-SSIMperformancefordifferentdigital-analogratiosontheKodakdatasetoverAWGNchannels. Fig. 9. Visualized examples for different methods transmitted over AWGN channels at SNR=10dB: (a) original image; (b) image recovered by BPG with 1/2LDPCand16QAM;(c)imagerecoveredbyHDA-DeepSCwith0.2DAratiousingunencryptedbits;(d)-(f)imagerecoveredbyHDA-DeepSCwith0.2, 0.87,and3DAratiousingencryptedbits,respectively. rate than the benchmarks for a given PSNR or MS-SSIM. TABLEIII Besides, we find that the learning-based methods outperform THEPSNRPERFORMANCEFORTHEENCRYPTEDANDUNENCRYPTED theBPGintermsofMS-SSIM,indicatingtheneuralnetworks BITSOVERAWGNCHANNELSATSNR=10DB operate as the better content generator, thereby generating the image with global consistency. E. Digital-Analog Ratio Comparisons Fig.8showsthecomparisonsacrossdifferentdigital-analog (DA) ratios by changing the ratio between the number of information. This suggests that the analog transmitter oper- transmitted symbols of digital and analog components, where ates as a continuous signal-based system, thereby effectively the total number of transmitted symbols is fixed. The larger reducing the quantization errors by decreasing the DA ratio. DA ratio means more semantic information is transmitted with the digital transmitter and vice versa. We can observe F. 
Data Security that the PSNR and MS-SSIM decrease as the DA ratio increases, which is caused by the unavoidable quantization Table III reports the PSNR performance for the encrypted errorsintroducedbythedigitaltransmitter.Themoresemantic and unencrypted bits, where these terms refer to whether the information transmitted through the digital transmitter, the encryption algorithm encrypts the bit streams transmitted by larger the quantization errors introduced to the transmitted the digital transmitter. We assume that the eavesdropper is ---PAGE BREAK--- 2490 IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS,VOL.43,NO.7,JULY2025 TABLEIV methodtoreducethebitbudgetoflearnedconstellations,such THERUNNINGTIMEPERIMAGECOMPARISONBETWEEN as achieving low-precision pseudo-analog transmission. The THEHDA-DEEPSCANDBPG cost is slight performance degradation. VI. CONCLUSION Inthispaper,wehaveintroducedaninnovativeHDAseman- tic communication framework that combines the strengths of analog and digital semantic communications. Our frame- incapable of decoding the encrypted bits and only decodes work aims to overcome the inherent limitations associated thesemanticinformationtransmittedbytheanalogtransmitter, with each approach. Building upon the framework, we intro- wheretheHDA-DeepSCmodelisknowntotheeavesdropper. duced a robust HDA semantic communication system called From Table III, the PSNR of encrypted bits is 20dB lower HDA-DeepSC, specifically designed for multimedia transmis- compared to that of unencrypted bits, indicating the images sion. HDA-DeepSC leverages digital communication methods recovered by encrypted bits are little like the original ones. In to transmit crucial semantic information, ensuring accurate other words, the eavesdropper obtains less information from delivery and data security. Additionally, it utilizes analog thesemanticinformationtransmittedbytheanalogtransmitter. 
communication methods to transmit auxiliary semantic infor- Besides, the PSNR of encrypted bits slightly decreases as mation, effectively mitigating the leveling-off and cliff-edge the DA ratio increases. This suggests that the HDA-DeepSC effects associated with traditional approaches. We also intro- effectively safeguards data with few bits while achieving the ducedanalog-digitalallocationandfusionmodulestoseparate high PSNR. Visual examples are presented in Figs. 9(c)-(f), and fuse the digital and analog components, respectively. where Figs. 9(d)-(f) are the images recovered by encrypted Besides, we have designed the Fourier-based loss function to bits.Interestingly,theessentialinformationisprotectedbythe guide the model in learning the long-distance dependencies HDA-DeepSC,e.g.thecolor,thebackground,andthetextures, and combined the rate constraint with the non-parametric, which proves the effectiveness of the HDA-DeepSC in data fully factorized density model. Moreover, we have proposed security. the diffusion framework enhanced signal detection, named DiffSDNet, by multiple denoising steps to reduce the noise G. Computational Complexity level at the low SNR regimes, in which we customized the The proposed HDA-DeepSC adopts the Swin-Transformer variance schedule and sampling algorithm for wireless com- as the semantic codec, in which the window multi-head munication environments. The numerical results have proved self-attention (W-MSA) module has high computational com- the effectiveness of DiffSDNet in denoising and demonstrated plexity. The computational complexity of W-MSA is O(N the superiority of HDA-DeepSC in terms of robustness, trans- h w (4C2+2M2C)),inwhichN,C,andM arethenumb × er missionrate,anddatasecurity,especiallyinlowSNRregimes. of × lay × ers, the width of the layer, and the number of patches, Therefore,theproposedHDAsemanticcommunicationframe- respectively. 
The channel codec consists of several dense workshowsgreatpromiseasacandidateforthenewsemantic layers,thecomputationalcomplexityofwhichisalsolinearin communication paradigm, offering significant potential for thenumberofpixels.Therefore,thecomputationalcomplexity real-world implementations. of the proposed HDA-DeepSC is linear encoding/decoding time in the number of pixels. To complete our discussion APPENDIXA of computational complexity, we have measured the average DERIVATIONOF(15) running time per image which is shown in Table IV. We can Assume the x , i = 1,2, ,N follows the N i.i.d. Gaus- i observe that the running time of HDA-DeepSC on the CPU ··· siansources(variables)withzeromeanandvarianceσ normal i is slightly slower than that of BPG on the CPU. However, distribution,thenthediscreteentropyofx=[x ,x , ,x ] 1 2 N the GPU can significantly accelerate the running time of ··· can be written as HDA-DeepSC, which means it can effectively support some delay-sensitive applications. H(x)= − E x ∼ p(x) [log 2 p(x)]= − E x ∼ p(x) log 2 i Π = N 1 p(x i ) (cid:20) (cid:21) H. Discussion of Hardware Implementation N x 2 = E log 2πσ2 i It is possible nowadays to implement analog systems with − i=1 xi∼ p(xi) (cid:20) 2 i − 2σ i 2 (cid:21) high-precision digital circuits, called pseudo-analog transmis- (cid:88) (cid:0) (cid:1) N x 2 N sion. For example, the pseudo-analog system SoftCast [40] = E i log 2πσ2 . (35) does not adopt the conventional constellations but modulates i=1 xi∼ p(xi) (cid:20) 2σ i 2 (cid:21) − i=1 2 i thenormalized2DdiscreteFouriercoefficientstothetransmit- (cid:88) (cid:88) (cid:0) (cid:1) With the (35), we can derive the following relationship, ted symbols directly. There are a lot of follow-up efforts, and someofthem[40],[41]havebeenvalidatedonsoftwareradio z 2 ˜z2 p (O la F tf D o M rm ) s . 
T w h i e th ref o o r r t e h , og it on is al fe f a r s e i q b u le en t c o y a d c i h v i i e s v io e n hy m b u ri l d tip a l n ex a i lo n g g H(z) − H(˜z)= i=1(cid:18) E zi∼ p(zi) (cid:20) 2σ i i 2 (cid:21) − E ˜zi∼ p(˜zi) (cid:20) 2σ˜ i i 2 (cid:21)(cid:19) (cid:88) and digital transmission on one hardware platform. For low- log σ i 2 precision digital circuits, we can employ the quantization − 2 σ˜2 i=1 (cid:18) i (cid:19) (cid:88) ---PAGE BREAK--- XIEetal.:HYBRIDDIGITAL-ANALOGSEMANTICCOMMUNICATIONS 2491 1 > E z 2 E ˜z2 [14] Y. Bo, Y. Duan, S. Shao, and M. Tao, “Joint coding-modulation for 2σ zi∼ p(zi) i − ˜zi∼ p(˜zi) i digital semantic communications via variational autoencoder,” IEEE i=1 (cid:88)(cid:0) (cid:2) (cid:3) (cid:2) (cid:3)(cid:1) Trans.Commun.,vol.72,no.9,pp.5626–5640,Sep.2024. σ2 log i , (36) [15] Y.He,G.Yu,andY.Cai,“Rate-adaptivecodingmechanismforsemantic − 2 σ˜2 communicationswithmulti-modaldata,”IEEETrans.Commun.,vol.72, (cid:88) i=1 (cid:18) i (cid:19) no.3,pp.1385–1400,Mar.2024. where σ and σ˜ are the variance of z and˜z , respectively. σ [16] L. Guo, W. Chen, Y. Sun, and B. Ai, “Device-edge digital semantic i i i i communication with trained non-linear quantization,” in Proc. IEEE is the maximum value between σ and σ˜ . i i 97thVeh.Technol.Conf.(VTC-Spring),Jun.2023,pp.1–5. σ2 We can observe the second term of (36), i.e., i, is the [17] C. Liu, C. Guo, Y. Yang, W. Ni, and T. Q. S. Quek, “OFDM-based σ˜2 constant. Especially, when z˜ is close to z, the seco i nd term digital semantic communication with importance awareness,” 2024, arXiv:2401.02178. will be zero. Therefore, we can drop the second term during [18] Q.Fuetal.,“Vectorquantizedsemanticcommunicationsystem,”IEEE training and only consider the first term of (36). With the WirelessCommun.Lett.,vol.12,no.6,pp.982–986,Jun.2023. 
Monte Carlo method, the entropy can be written as [19] Q.Hu,G.Zhang,Z.Qin,Y.Cai,G.Yu,andG.Y.Li,“Robustsemantic communicationswithmaskedVQ-VAEenabledcodebook,”IEEETrans. H(z) H(˜z) z2 ˜z2. (37) WirelessCommun.,vol.22,no.12,pp.8707–8722,Dec.2023. − ≈ − [20] H. Gao, G. Yu, and Y. Cai, “Adaptive modulation and retransmission Consideringthecomputationandtrainingcomplexity,werelax scheme for semantic communication systems,” IEEE Trans. Cognit. Commun.Netw.,vol.10,no.1,pp.150–163,Feb.2024. the (37) to the subtraction between z and z˜, which is the (15) [21] J. Huang, K. Yuan, C. Huang, and K. Huang, “D2-JSCC: Digital as follows. deepjointsource-channelcodingforsemanticcommunications,”2024, arXiv:2403.07338. [22] U. Mittal and N. Phamdo, “Hybrid digital-analog (HDA) joint source- APPENDIXB channel codes for broadcasting and robust communications,” IEEE VISUALIZEDRESULTS Trans.Inf.Theory,vol.48,no.5,pp.1082–1102,May2002. [23] T.Fujihashi,T.Koike-Akino,andT.Watanabe,“Softdelivery:Survey InFig.9(a)-(c),wecanobservetheproposedHDA-DeepSC onanewparadigmforwirelessandmobilemultimediastreaming,”ACM can restore more details, e.g., the mouth and feathers of Comput.Surv.,vol.56,no.2,pp.1–37,Sep.2023. the parrot, than the BPG with LDPC and 16QAM due to [24] M. Skoglund, N. Phamdo, and F. Alajaji, “Hybrid digital–analog delivering essential semantic information accurately by the source–channel coding for bandwidth compression/expansion,” IEEE Trans.Inf.Theory,vol.52,no.8,pp.3757–3763,Aug.2006. digital transmitter. [25] M. Ru¨ngeler, J. Bunte, and P. Vary, “Design and evaluation of hybrid digital-analog transmission outperforming purely digital concepts,” REFERENCES IEEETrans.Commun.,vol.62,no.11,pp.3983–3996,Nov.2014. [26] E.Ko¨kenandE.Tuncel,“Onrobustnessofhybriddigital/analogsource- [1] H.Xie,Z.Qin,Z.Han,andK.B.Letaief,“Hybriddigital-analogjoint channel coding with bandwidth mismatch,” IEEE Trans. Inf. 
Theory, semantic-channelcodingforimagetransmission,”inProc.IEEEGlobal vol.61,no.9,pp.4968–4983,Sep.2015. Commun.Conf.,CapeTown,SouthAfrica,Dec.2024,pp.1–6. [27] T.Fujihashi,T.Koike-Akino,T.Watanabe,andP.V.Orlik,“HoloCast+: [2] C.-X. Wang et al., “On the road to 6G: Visions, requirements, key Hybrid digital-analog transmission for graceful point cloud deliv- technologies, and testbeds,” IEEE Commun. Surveys Tuts., vol.25, ery with graph Fourier transform,” IEEE Trans. Multimedia, vol.24, no.2,pp.905–974,2ndQuart.,2023. pp.2179–2191,2022. [3] Z.Qin,X.Tao,J.Lu,W.Tong,andG.YeLi,“Semanticcommunica- [28] J. A. Hart, The Economics, Technology and Content of Digital TV. tions:Principlesandchallenges,”2021,arXiv:2201.01389. Boston,MA,USA:Springer,2004. [4] H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled [29] L.Yu,H.Li,andW.Li,“Wirelessscalablevideocodingusingahybrid semantic communication systems,” IEEE Trans. Signal Process., digital-analog scheme,” IEEE Trans. Circuits Syst. Video Technol., vol.69,pp.2663–2675,2021. vol.24,no.2,pp.331–345,Feb.2014. [5] P. Yi, Y. Cao, X. Kang, and Y.-C. Liang, “Deep learning-empowered [30] C.Lan,C.Luo,W.Zeng,andF.Wu,“Apracticalhybriddigital-analog semanticcommunicationsystemswithasharedknowledgebase,”IEEE scheme for wireless video transmission,” IEEE Trans. Circuits Syst. Trans.WirelessCommun.,vol.23,no.6,pp.6174–6187,Jun.2024. VideoTechnol.,vol.28,no.7,pp.1634–1647,Jul.2018. [6] Z. Weng, Z. Qin, X. Tao, C. Pan, G. Liu, and G. Y. Li, “Deep [31] B. Tan, J. Wu, R. Wang, W. Luo, and J. Liu, “An optimal resource learning enabled semantic communications with speech recogni- allocationforhybriddigital–analogwithcombinedmultiplexing,”IEEE tion and synthesis,” IEEE Trans. Wireless Commun., vol.22, no.9, InternetThingsJ.,vol.6,no.1,pp.1125–1135,Feb.2019. pp.6227–6240,Sep.2023. [32] P. Yahampath, “Video coding for OFDM systems with imperfect CSI: [7] E. Grassucci, C. Marinoni, A. Rodriguez, and D. 
Comminiello, A hybrid digital–analog approach,” Signal Process., Image Commun., “Diffusion models for audio semantic communication,” in Proc. IEEE vol.87,Sep.2020,Art.no.115903. Int. Conf. Acoust., Speech Signal Process. (ICASSP), Seoul, South [33] Z.Liuetal.,“Swintransformer:Hierarchicalvisiontransformerusing Korea,Apr.2024,p.13. shiftedwindows,”inProc.IEEE/CVFInt.Conf.Comput.Vis.(ICCV), [8] T. Han, Q. Yang, Z. Shi, S. He, and Z. Zhang, “Semantic-preserved Oct.2021,pp.9992–10002. communicationsystemforhighlyefficientspeechtransmission,”IEEE [34] J.Balle,D.Minnen,S.Singh,S.J.Hwang,andN.Johnston,“Variational J.Sel.AreasCommun.,vol.41,no.1,pp.245–259,Jan.2023. imagecompressionwithascalehyperprior,”inProc.Int.Conf.Learn. [9] J. Dai et al., “Nonlinear transform source-channel coding for seman- Represent.,Vancouver,BC,Canada,Apr.2018. tic communications,” IEEE J. Sel. Areas Commun., vol.40, no.8, [35] H. Xie, Z. Qin, X. Tao, and K. B. Letaief, “Task-oriented multi-user pp.2300–2316,Aug.2022. semanticcommunications,”IEEEJ.Sel.AreasCommun.,vol.40,no.9, [10] G.Zhang,Q.Hu,Z.Qin,Y.Cai,G.Yu,andX.Tao,“Aunifiedmulti- tasksemanticcommunicationsystemformultimodaldata,”IEEETrans. pp.2584–2597,Sep.2022. Commun.,vol.72,no.7,pp.4101–4116,Jul.2024. [36] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilis- [11] H. Wu, Y. Shao, E. Ozfatura, K. Mikolajczyk, and D. Gu¨ndu¨z, tic models,” in Proc. Adv. Neural Inf. Process. Syst., Dec. 2020, “Transformer-aided wireless image transmission with channel pp.6840–6851. feedback,” IEEE Trans. Wireless Commun., vol.23, no.9, [37] J.Song,C.Meng,andS.Ermon,“Denoisingdiffusionimplicitmodels,” pp.11904–11919,Sep.2024. inProc.Int.Conf.Learn.Represent.,May2021. [12] S. Wang et al., “Wireless deep video semantic transmission,” IEEE [38] A. Ignatov et al., “PIRM challenge on perceptual image enhancement J.Sel.AreasCommun.,vol.41,no.1,pp.214–229,Jan.2023. 
onsmartphones:Report,”inProc.Eur.Conf.Comput.Vis.,Jan.2019, [13] T.-Y.Tung,D.B.Kurka,M.Jankowski,andD.Gu¨ndu¨z,“DeepJSCC- pp.315–333. Q: Constellation constrained deep joint source-channel coding,” IEEE [39] L.Chi,B.Jiang,andY.Mu,“FastFourierconvolution,”inProc.Adv. J.Sel.AreasInf.Theory,vol.3,no.4,pp.720–731,Dec.2022. NeuralInf.Process.Syst.,Dec.2020,pp.4479–4488. ---PAGE BREAK--- 2492 IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS,VOL.43,NO.7,JULY2025 [40] S.JakubczakandD.Katabi,“SoftCast:One-size-fits-allwirelessvideo,” Rebecca Moores Professor with the Electrical and Computer Engineering in Proc. ACM SIGCOMM Conf., New York, NY, USA, Aug. 2010, Department and the Computer Science Department, University of Houston, pp.449–450. Houston, TX, USA. His main research targets on the novel game-theory- [41] X. L. Liu, W. Hu, Q. Pu, F. Wu, and Y. Zhang, “ParCast: Soft video relatedconceptscriticaltoenablingefficientanddistributiveuseofwireless delivery in MIMO-OFDM WLANs,” in Proc. 18th Annu. Int. Conf. networks with limited resources, wireless resource allocation and manage- MobileComput.Netw.,Istanbul,Turkey,Aug.2012,pp.233–244. ment, wireless communications and networking, quantum computing, data science, smart grids, carbon neutralization, and security and privacy. He received a NSF Career Award in 2010, the Fred W. Ellersick Prize of the IEEECommunicationSocietyin2011,theBestPaperAwardoftheEURASIP Journal on Advances in Signal Processing in 2015, the IEEE Leonard G. Abraham Prize in the field of Communications Systems (Best Paper Award in IEEEJOURNALONSELECTEDAREASINCOMMUNICATIONS) in 2016, theIEEEVehicularTechnologySociety2022BestLandTransportationPaper Award,andseveralbestpaperawardsinIEEEconferences.HewasanIEEE Huiqiang Xie (Member, IEEE) received the B.S. Communications Society Distinguished Lecturer from 2015 to 2018 and an degreefromNorthwesternPolytechnicalUniversity, ACM Distinguished Speaker from 2022 to 2025. 
He has been an AAAS theM.S.degreefromChongqingUniversity,andthe Fellow since 2019 and an ACM Fellow since 2024. He has been a 1% Ph.D. degree from the Queen Mary University of Highly Cited Researcher since 2017 according to Web of Science. He is Londonin2023.From2023to2024,hewasaPost- also the Winner of the 2021 IEEE Kiyo Tomiyasu Award (an IEEE Field Doctoral Research Associate with The Hong Kong Award), for outstanding early to mid-career contributions to technologies University of Science and Technology, Guangzhou holding the promise of innovative applications, with the following citation: Campus.HeiscurrentlyanAssociateProfessorwith Forcontributionstogametheoryanddistributedmanagementofautonomous Jinan University. He received the 2023 IEEE ICC communicationnetworks. StudentTravelGrant,the2023IEEEICCBestPaper Award,andthe2023IEEESignalProcessingSociety Best Paper Award. He was also the Organizing Committee Co-Chair of 2024 EIECT. He is an Associate Editor of Journal of Communications and Networks. Zhijin Qin (Senior Member, IEEE) is currently an Associate Professor with Tsinghua University, Beijing, China. She was with the Imperial College London, London, U.K.; Lancaster University, Lan- KhaledB.Letaief(Fellow,IEEE)receivedtheB.S. caster,U.K.;andQueenMaryUniversityofLondon, degree(Hons.)inelectricalengineeringfromPurdue London, from 2016 to 2022. Her research interests University at West Lafayette, IN, USA, in Decem- includesemanticcommunicationsandsparsesignal ber 1984, the M.S. and Ph.D. degrees in electrical processing. She was a recipient of the 2017 IEEE engineering from Purdue University, in 1986, and GLOBECOM Best Paper Award, 2018 IEEE Sig- 1990, respectively, and the Ph.D. 
Honoris Causa nal Processing Society Young Author Best Paper degree from the University of Johannesburg, South Award,2021IEEECommunicationsSocietySignal Africa,in2022.Heisaninternationallyrecognized ProcessingforCommunicationsCommitteeEarlyAchievementAward,2022 leaderinwirelesscommunicationsandnetworks.He IEEE Communications Society Fred W. Ellersick Prize, and 2023 IEEE isamemberofUnitedStatesNationalAcademyof ICC Best Paper Award. She was a Guest Editor of IEEE JOURNAL ON Engineering, a fellow of Hong Kong Institution of SELECTEDAREASINCOMMUNICATIONS(JSAC)SpecialIssueonSemantic Engineers,amemberofIndiaNationalAcademyofSciences,andamember Communications and an Area Editor of IEEE JOURNAL ON SELECTED ofHongKongAcademyofEngineeringSciences.Heisalsorecognizedby AREAS IN COMMUNICATIONS Series. She was also the Symposium Co- ThomsonReutersasanISIHighlyCitedResearcherandwaslistedamongthe Chair of IEEE GLOBECOM 2020 and 2021. She is an Associate Editor of 2020top30ofAI2000InternetofThingsMostInfluentialScholars.Hewasa IEEE TRANSACTIONS ON COMMUNICATIONS, IEEE TRANSACTIONS ON recipientofmanydistinguishedawardsandhonors,includingthe2022IEEE COGNITIVENETWORKING,andIEEECOMMUNICATIONSLETTERS. Communications Society Edwin Howard Armstrong Achievement Award, 2021 IEEE Communications Society Best Survey Paper Award, 2019 IEEE CommunicationsSocietyandInformationTheorySocietyJointPaperAward, and 2016 IEEE Marconi Prize Paper Award in Wireless Communications. He has also been a dedicated teacher committed to excellence in teaching and scholarship. He received the Michael G. Gale Medal for Distinguished Teaching(highestuniversity-wideteachingawardandonlyonerecipient/year is honored for his/her contributions). Since 1993, he has been with The Hong Kong University of Science and Technology (HKUST), where he Zhu Han (Fellow, IEEE) received the B.S. 
degree has held many administrative positions, including the Acting Provost, the inelectronicengineeringfromTsinghuaUniversity, Head of the Electronic and Computer Engineering Department, and the Beijing, China, in 1997, and the M.S. and Ph.D. DirectorofHongKongTelecomInstituteofInformationTechnology.While degreesinelectricalandcomputerengineeringfrom at HKUST, he was the Chair Professor and the Dean of Engineering. He theUniversityofMarylandatCollegePark,College is well recognized for his dedicated service to professional societies and Park, MD, USA, in 1999 and 2003, respectively. IEEE, where he has served in many leadership positions. These include From2000to2002,hewasaResearchandDevelop- the Founding Editor-in-Chief of the prestigious IEEE TRANSACTIONS ON mentEngineerwithJDSU,Germantown,MD,USA. WIRELESS COMMUNICATIONS. He also served as the President of the From 2003 to 2006, he was a Research Associate IEEECommunicationsSociety(2018–2019),theworld’sleadingorganization with the University of Maryland at College Park. for communications professionals with headquarters in New York City and From 2006 to 2008, he was an Assistant Professor membersin162countries.HealsoservedasamemberfortheIEEEBoard with Boise State University, Boise, ID, USA. He is currently a John and ofDirectors.