Deep Reinforcement Learning-based Resource Allocation and Mode Selection for Semantic Communication

Hyeonho Noh∗, Sojeong Park†, and Hyun Jong Yang∗
∗Department of Electrical and Computer Engineering, Seoul National University, Korea
†Department of Electrical Engineering, Pohang University of Science and Technology, Korea

Abstract: In this paper, we aim to solve the joint resource allocation and mode selection problem, in which an agent adaptively allocates communication users to appropriate resource units and toggles between bit and semantic transmission modes while determining the number of transmitted semantic symbols in semantic communication mode. In contrast to the common yet unrealistic assumption of prior research, which posits limitless data transmission over infinite periods, we focus on unsaturated traffic conditions, where users transmit a finite amount of data within restricted time frames. To evaluate the efficiency of data transmission in the semantic domain under unsaturated traffic conditions, we propose a short-term semantic transmission rate (SR) as the evaluation metric of the joint problem. Under these unsaturated traffic scenarios, the challenge is combinatorial: resource allocation, transmission mode selection, and symbol lengths must be optimized simultaneously across the time-frequency axis. The high complexity and the large number of unknown variables make the problem difficult for conventional optimization techniques to solve effectively. In response, we propose a deep reinforcement learning-based method that, in each time step, allocates users to each resource unit, determines the transmission mode, and selects the data size according to the communication environment and the users' packet states. Extensive experiments demonstrate superior performance over conventional schemes in terms of semantic transmission performance.

Index Terms: Semantic communication, Resource allocation, Deep reinforcement learning, Semantic rate, Mode selection

ISBN 978-3-903176-65-2 © 2024 IFIP
Authorized licensed use limited to: WUHAN UNIVERSITY OF TECHNOLOGY. Downloaded on February 09, 2026 at 07:53:50 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

In beyond 5G and 6G, wireless communication must serve many more user equipments (UEs) with larger amounts of data, resulting in a shortage of frequency spectrum [1], [2]. However, traditional wireless communication has been primarily focused on the transmission and reception of data without comprehending its actual content [3], [4]. As a result, the amount of data that can be transmitted is strictly limited by the frequency spectrum in use.

To address the frequency spectrum shortage problem in conventional communication, task-oriented semantic communication, which can surpass the Shannon capacity in terms of performing specific tasks, has been proposed and is actively under research [3], [5]–[10]. Semantic communication extracts, compresses, and transmits features relevant to the intended task from data, rather than transmitting the raw data itself. Thus, semantic communication employs lossy data compression, but it excels in task performance efficiency [11].

In the field of text transmission, semantic communication models like DeepSC [11] have demonstrated excellent performance. However, they maintain a fixed transmission symbol size regardless of channel state information (CSI), analogous to keeping the coding rate and modulation fixed in conventional communication. To exploit the benefits of channel diversity, a resource allocation (RA) model that combines channel assignment and transmission volume control of semantic symbols was proposed in [12]. Specifically, the authors defined the spectral efficiency of semantic communication when transmitting infinitely many sentences over very long transmission times [12]–[14]. However, this assumption does not align with real-world scenarios, where user traffic tends to be unsaturated, meaning that transmission time and packet lengths are strictly bounded [15].

This paper goes further by addressing the joint RA and mode selection (MS) problem in unsaturated traffic scenarios, where UEs participate in uplink communication while holding data of different sizes and numbers. The main contributions are as follows:

• Building on the definition of semantic spectral efficiency in a long-term perspective, we propose a short-term semantic transmission rate (SR) to evaluate the data transmission rate in unsaturated traffic conditions. The SR reflects more realistic communication scenarios, where the frame length is strictly limited and the length of data varies.
• Under the definition of SR, the performance superiority between bit communication and semantic communication changes depending on the signal-to-noise ratio (SNR) and the data size. Therefore, we propose a joint RA and MS problem that dynamically allocates UEs to resource units (RUs) in the frequency domain, adaptively selects the transmission mode between bit and semantic communication, and determines the number of transmitted semantic symbols for semantic communication.
• To solve the proposed RA and MS optimization problem while considering both the UEs' SNR and data size, which is an intractable problem due to its combinatorial aspect [16], we propose an algorithm based on deep reinforcement learning (DRL), which has proven to be a powerful tool for solving complex resource management problems in recent years [5], [17], [18].

As a case study, we evaluate the proposed DRL-based RA and MS algorithm in the field of text transmission. Our results demonstrate that the proposed DRL-based RA and MS algorithm achieves superior performance in terms of sentence similarity [11], [12], [19], [20] over various conventional schemes such as DeepSC and bit communication.

II. SYSTEM MODEL AND PROBLEM FORMULATION

A. Scenario

We consider a scenario in which a base station (BS) communicates with K UEs. Given the CSI and the sentences to be transmitted by the UEs, the BS allocates the UEs to N RUs while also selecting the optimal transmission mode, which is either conventional bit communication or semantic communication. Additionally, if the BS decides to serve a UE with semantic communication, it needs to determine the number of transmitted semantic symbols. The primary objective of the RA and MS process is to maximize task-specific performance metrics within the predefined packet length for all UEs. The RA and MS process is shown in Fig. 1.

Fig. 1. The proposed deep reinforcement learning-based RA and MS protocol.

B. Wireless Communication Model

We define $a_{n,k}$ as a binary RU assignment variable such that $a_{n,k} = 1$ if the k-th UE is allocated on the n-th RU, and $a_{n,k} = 0$ otherwise. Then, we can represent the constraints on the RA as follows:

$$\sum_{n \in \mathcal{N}} a_{n,k} \le 1, \quad \forall k \in \mathcal{K}, \tag{1a}$$
$$\sum_{k \in \mathcal{K}} a_{n,k} \le 1, \quad \forall n \in \mathcal{N}, \tag{1b}$$

where $\mathcal{N} = \{0, 1, \ldots, N-1\}$ and $\mathcal{K} = \{0, 1, \ldots, K-1\}$. Constraint (1a) ensures that each UE is assigned to at most one RU, and constraint (1b) ensures that each RU is occupied by at most one UE.

Let $h_{n,k} \in \mathbb{C}$ denote the uplink communication channel between the BS and the k-th UE on the n-th RU. Then, the SNR for the k-th UE on the n-th RU is given by $\Gamma_{n,k} = P_{n,k} |h_{n,k}|^2 / \sigma^2$, where $P_{n,k}$ is the transmit power of the k-th UE on the n-th RU, and $\sigma^2$ is the noise variance.

C. Text Transmission Performance

Many researchers rely on a specific yet well-developed large language model, known as bi-directional encoder representations from transformers (BERT) [21], to measure how accurately the semantic information is transmitted in the text transmission field [11], [12], [19], [20]. In this paper, we adopt the sentence similarity of [12], which is defined by

$$F(\mathbf{s}, \hat{\mathbf{s}}) = \frac{B(\mathbf{s}) B(\hat{\mathbf{s}})^{T}}{\lVert B(\mathbf{s}) \rVert \, \lVert B(\hat{\mathbf{s}}) \rVert}, \tag{2}$$

where $B(\mathbf{s})$ represents the output embedding vector of the BERT model for a sentence $\mathbf{s}$. We leverage a pre-trained BERT model to compute the sentence similarity. Note that from the similarity definition in (2), we have $0 \le F(\mathbf{s}, \hat{\mathbf{s}}) \le 1$, with 1 indicating the highest similarity and 0 indicating no relationship between two sentences.

D. Definition of Semantic Rate

With the definition of sentence similarity, SR is proposed in [12] for measuring the semantic information transmission rate using the BERT model. However, the conventional approach calculates the average value of SR over an infinite frame length when sending a large amount of data, whereas in real communication environments each user transmits limited data of different sizes. Furthermore, all users must transmit data within a predetermined frame length to synchronize the uplink transmission. To address these practical issues, we newly define the SR in this paper.

Let $\mathcal{D}_k = \{\mathbf{s}_{j,k} = [w_{j,k,0}, w_{j,k,1}, \ldots, w_{j,k,L_{j,k}-1}]\}_{j=0}^{D_k-1}$ denote the text dataset for the k-th UE with size $D_k$, where $\mathbf{s}_{j,k}$ is the j-th sentence with length $L_{j,k}$ and $w_{j,k,l}$ is the l-th word of the j-th sentence of the k-th UE. In addition, one can define the amount of semantic information of $\mathbf{s}_{j,k}$ as $I_{j,k}$ (suts).

Each sentence is transmitted via either bit communication or semantic communication, as shown in Fig. 1. We denote $m_{n,k}$ as the binary transmission mode variable of the k-th UE on the n-th RU such that $m_{n,k} = 0$ represents bit communication while $m_{n,k} = 1$ represents semantic communication.

In bit communication, the transmitter protects information from impairments such as noise or distortion by performing rate adaptation through source coding and channel coding based on the current SNR $\Gamma_{n,k}$. In semantic communication, successful transmission of semantic information is guaranteed by extracting semantic information and compressing the sentence length to $c_{n,k}$ according to the SNR $\Gamma_{n,k}$ through semantic encoding and channel encoding. The encoded symbol stream can then be represented by

$$\mathbf{x} = \begin{cases} C_{bc}(\mathbf{s}; \Gamma_{n,k}, m_{n,k}), & \text{if } m_{n,k} = 0, \\ C_{sc}(\mathbf{s}; \Gamma_{n,k}, c_{n,k}, m_{n,k}, \beta), & \text{if } m_{n,k} = 1, \end{cases} \tag{3}$$

where $C_{sc}$ includes channel encoding and semantic encoding, $C_{bc}$ includes channel encoding, source encoding, and modulation, and $\beta$ is the parameter set of the semantic and channel encoder networks. If $\mathbf{x}$ is sent, the signal received at the receiver is $\mathbf{y} = h\mathbf{x} + \mathbf{z}$, where $\mathbf{z}$ is the additive white Gaussian noise (AWGN) that follows $\mathcal{CN}(0, \sigma^2 \mathbf{I})$. With the received signal, the decoded sentence can be represented as

$$\hat{\mathbf{s}} = \begin{cases} C_{bc}^{-1}(\mathbf{y}; \Gamma_{n,k}, m_{n,k}), & \text{if } m_{n,k} = 0, \\ C_{sc}^{-1}(\mathbf{y}; \Gamma_{n,k}, c_{n,k}, m_{n,k}, \beta), & \text{if } m_{n,k} = 1, \end{cases} \tag{4}$$

where the inverse operation for $C$ denotes the reverse process of $C$. Finally, the SR (suts/s) on the n-th RU for the k-th UE is defined by

$$\phi(\mathcal{D}_k; \Gamma_{n,k}, c_{n,k}, m_{n,k}) = \frac{\sum_{j=0}^{D_k-1} W I_{j,k} \cdot F(\mathbf{s}_{j,k}, \hat{\mathbf{s}}_{j,k})}{L_{\text{frame}}}, \tag{5}$$

where $W$ is the bandwidth and $L_{\text{frame}}$ is the frame length.

Note that the sentence similarity heavily depends on the design of $C_{sc}$ and $C_{bc}$. In bit communication, the design of $C_{bc}$ is standardized according to the SNR $\Gamma_{n,k}$. Then, it must be satisfied that $\sum_{j=0}^{D_k-1} \hat{L}_{j,k} \le L_{\text{frame}}$, where $\hat{L}_{j,k}$ is the length of $C_{bc}(\mathbf{s}_{j,k}; \Gamma_{n,k})$. In semantic communication, the optimal channel coding dimension with respect to SNR has not been thoroughly surveyed. Thus, we define the channel coding dimension of semantic communication for the k-th UE on the n-th RU as $c_{n,k}$. Semantic communication then transmits each word by packing it into $c_{n,k}$ symbols. We determine this value to regulate the number of transmitted semantic symbols. As in bit communication, the condition $\sum_{j=0}^{D_k-1} c_{n,k} L_{j,k} \le L_{\text{frame}}$ must be satisfied for the k-th UE on the n-th RU.

E. Problem Formulation

From (1) and (5), we formulate the joint RA and MS optimization problem that maximizes the sum of SR (S-SR) as follows:

$$\max_{\mathbf{a}, \mathbf{c}, \mathbf{m}} \; \Phi = \sum_{n \in \mathcal{N}} \sum_{k \in \mathcal{K}} a_{n,k} \, \phi(\mathcal{D}_k; \Gamma_{n,k}, c_{n,k}, m_{n,k}), \tag{6a}$$
$$\text{s.t.} \quad \text{(1a), (1b)}, \tag{6b}$$
$$\sum_{j \in \mathcal{D}_k} c_{n,k} L_{j,k} \le L_{\text{frame}}, \quad \forall n \in \mathcal{N}, \forall k \in \mathcal{K}, \tag{6c}$$
$$\sum_{j \in \mathcal{D}_k} \hat{L}_{j,k} \le L_{\text{frame}}, \quad \forall n \in \mathcal{N}, \forall k \in \mathcal{K}, \tag{6d}$$
$$c_{n,k} \in \mathbb{N}, \quad \forall n \in \mathcal{N}, \forall k \in \mathcal{K}, \tag{6e}$$
$$a_{n,k}, m_{n,k} \in \{0, 1\}, \quad \forall n \in \mathcal{N}, \forall k \in \mathcal{K}, \tag{6f}$$

where $\mathbf{a}$, $\mathbf{c}$, and $\mathbf{m}$ are the sets of all variables $a_{n,k}$, $c_{n,k}$, and $m_{n,k}$ for $n \in \mathcal{N}$ and $k \in \mathcal{K}$, respectively. Due to its nonconcave aspect, the RA and MS optimization problem is intractable [16].

III. PROPOSED DRL-BASED RA OPTIMIZATION

A. Proposed DRL structure

We propose a DRL structure consisting of an agent, which performs RA and MS based on the SNR and the data size. If the allocated UE uses semantic communication, the dimension of the channel encoder and decoder $c_{n,k}$, i.e., the number of symbols for each word, is selected to maximize the S-SR $\Phi$ in (6). We obtain the solution by precomputing $\Phi$ for all possible $c_{n,k}$ and organizing the results into an SR table, as shown in Fig. 2. If the agent chooses bit communication for data transmission, the sentence is conveyed using the conventional bit communication protocol.

Fig. 2. Semantic rate table according to SNR and data size $c_{n,k}$.

B. Definitions of Parameters in DRL

Here, we define the result of RA and MS, whether it is bit communication or semantic communication, as an action. The BS selects actions corresponding to each RU index at each time step based on the current state. Therefore, the time step index can be set as $t \in \mathcal{N}$. The state space, action space, and reward function of the agent are defined below.

State Space: The state includes the CSI and the dataset to be transmitted by the UEs, which is defined as $\tilde{s}_{n,k} = \{\Gamma_{n,k}, \mathcal{D}_k\}$. Additionally, the initial state for all RUs and all UEs is defined as $S_0 = \bigcup_{n \in \mathcal{N}} \bigcup_{k \in \mathcal{K}} \tilde{s}_{n,k}$. When the k-th UE is selected as an action during the DRL procedure, we set $\Gamma_{n,k} = -1$ for all $n$ to mark it as an unavailable option.

Action Space: The action is defined by $a_t \in \mathcal{A}$, which represents the result of RA and MS on the t-th RU. Thus, we can represent the action as $a_t = \{(k, m_{t,k}) \mid a_{t,k} = 1, \forall k \in \mathcal{K}\}$.

Reward Function: We define the reward function of the agent as $r_t = \sum_{k \in \mathcal{K}} a_{t,k} \phi_{t,k}$.

C. DRL Training Process

Initialization: We adopt the Deep Q-network (DQN) [22] as the learning framework of the agent. Thus, we use a parameter $\theta$ that defines an action-value function $Q(S, a; \theta)$ for the agent. In addition, we initialize the replay memory $\mathcal{E}$ of the agent to capacity $E$.

Experience collection: At each time step $t$, the agent iteratively collects experience by selecting actions. Each action is drawn in an epsilon-greedy fashion with linear decay, i.e., $\epsilon(e) = \max\{1 - e/Z, 0.01\}$, where $Z$ is the decaying rate constant and $e$ is the episode step. The agent selects a random action $a_t$ with probability $\epsilon(e)$, and selects $a_t = \arg\max_a Q(S_t, a; \theta)$ otherwise. At each time step, the agent stores the transition $(S_t, a_t, r_t, S_{t+1})$ in $\mathcal{E}$.

Updating model parameters: With the stored experiences in the replay memory, the agent updates the learning parameter $\theta$. To do so, the agent samples a random mini-batch of $B$ transitions $(S_j, a_j, r_j, S_{j+1})$ from $\mathcal{E}$. We set $y_j = r_j$ if $S_{j+1}$ is a terminal state, and $y_j = r_j + \gamma \max_a Q(S_{j+1}, a; \theta)$ otherwise, where $\gamma$ is the discount factor. Then, the training loss is $J(\theta) = \sum_j (y_j - Q(S_j, a_j; \theta))^2 / B$. The agent performs a gradient descent step on $J(\theta)$ and updates $\theta$.

IV. SIMULATION RESULTS

To evaluate the performance of the proposed DRL-based RA and MS algorithm in a scenario where both semantic and bit communication are available, we have conducted simulations with the proposed DRL algorithm and baseline methods.

A. Experimental Setup

We adopt the European Parliament proceedings parallel corpus [23]. It includes around 2.0 million sentences and 53 million words. We sample 200,000 sentences from the dataset and divide them into a training dataset and a test dataset. In addition, we only collect sentences with a length of 4 to 30.

We examine baselines in RA methods and communication types. For RA, we investigate two methods: random and max-SNR [24], [25]. The random method chooses UEs regardless of SNR and data size, while the max-SNR method prioritizes UEs based solely on SNR. In terms of communication types, semantic communication-based and bit communication-based systems are considered. In the semantic communication-based system, we refer to it as "DeepSC" [11] when the channel coding dimension is fixed at eight and "Semantic" when the channel coding dimension is optimized according to SNR. In the bit communication-based system, we adopt Huffman coding as the source coding and low-density parity check (LDPC) as the channel coding. We follow the 5G standard in terms of coding rate and modulation, and [26] to obtain the modulation and coding scheme index according to SNR.

We set the bandwidth $W = 180$ kHz and the frame length $L_{\text{frame}} = 1024$. We assume that the amounts of semantic information of all sentences are equivalent, i.e., $I_{j,k} = 1$ for all $(j, k)$. In all experiments, the number of users is set to 5, and the number of resource blocks is fixed at 5 3.

B. Result Analysis

We first conduct a comparative analysis between the conventional and proposed schemes in a scenario involving randomly varying data sizes ranging from 1 to 10 and SNR levels distributed uniformly between 3 dB and 15 dB, which is presented in Table I. From the result, we conclude that the proposed DRL-based method achieves the highest S-SR among all methods.

TABLE I
THE S-SR COMPARISON OF THE PROPOSED AND CONVENTIONAL METHODS WITH RANDOM SNR AND RANDOM NUMBER OF SENTENCES.

Method          S-SR
Random + BC     1,776
Random + SC     2,464
Max-SNR + BC    2,169
Max-SNR + SC    2,498
DRL + BC        2,374
DRL + SC        3,091
Proposed        3,113

In the following, we assess the S-SR of the bit communication only, semantic communication only, and proposed schemes with the DRL method across different numbers of sentences, as shown in Fig. 3, to ascertain the influence of MS. When a UE sends a relatively small number of sentences, it can achieve a higher S-SR with bit communication because it can reliably send them within the frame length. However, when sending a large number of sentences, compressing sentences into semantic information and transmitting them proves to be much more effective. Thus, the proposed method, which allows users to flexibly choose between bit and semantic communication based on the data size, achieves the highest S-SR compared to the other two communication techniques.

Fig. 3. The S-SR comparison of the proposed and conventional methods with respect to the number of sentences. An AWGN channel with a uniform distribution of SNR from 3 dB to 15 dB is considered.

Fig. 4 shows the S-SR of the proposed and conventional methods for different SNRs. In a low SNR environment, the S-SR of bit communication deteriorates because the data cannot be completely restored. In contrast, semantic communication provides a significantly better S-SR in low SNR conditions; however, it shows a slightly lower S-SR than bit communication when the SNR is 9 dB or higher. While semantic communication experiences some loss in S-SR performance due to lossy compression, bit communication achieves better performance in high SNR environments due to its precise data reconstruction. The proposed method nevertheless outperforms all baseline methods across the entire SNR range by adaptively selecting the optimal transmission mode.

Fig. 4. The S-SR comparison of the proposed and conventional methods with respect to SNR. The number of sentences each UE possesses is two.

V. CONCLUSION

We proposed a DRL-based algorithm for optimizing joint RA and MS, effectively allocating UEs to RUs and determining the optimal transmission mode between semantic and bit-based communication. Our approach dynamically adjusts the number of transmitted semantic symbols, addressing the complexity of unsaturated traffic conditions. Experiments show superior performance over traditional schemes like DeepSC and bit communication, particularly in terms of sentence similarity. Future work will focus on refining the definition and quantification of semantic information in sentence data and expanding the framework to more complex network scenarios. This will enhance the system's adaptability and efficiency, paving the way for more intelligent semantic communication solutions in evolving wireless networks.

VI. ACKNOWLEDGEMENT

This work was supported in part by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00161, 6G MIMO System Research), in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00250191), and in part by the New Faculty Startup Fund from Seoul National University.

REFERENCES

[1] Deniz Gündüz et al., "Beyond transmitting bits: Context, semantics, and task-oriented communications," IEEE J. Sel. Areas Commun., vol. 41, no. 1, pp. 5–41, 2023.
[2] Yalin E. Sagduyu, Sennur Ulukus, and Aylin Yener, "Task-oriented communications for nextG: End-to-end deep learning and AI security aspects," IEEE Wireless Commun., vol. 30, no. 3, pp. 52–60, 2023.
[3] Wanting Yang et al., "Semantic communications for future internet: Fundamentals, applications, and challenges," IEEE Commun. Surv. Tut., vol. 25, no. 1, pp. 213–250, 2023.
[4] Christina Chaccour, Walid Saad, Mérouane Debbah, Zhu Han, and H. Vincent Poor, "Less data, more knowledge: Building next generation semantic communication networks," IEEE Commun. Surveys Tuts., pp. 1–1, 2024.
[5] Haijun Zhang et al., "DRL-driven dynamic resource allocation for task-oriented semantic communication," IEEE Trans. Commun., vol. 71, no. 7, pp. 3992–4004, 2023.
[6] Hongwei Zhang et al., "Deep learning-enabled semantic communication systems with task-unaware transmitter and dynamic data," IEEE J. Sel. Areas Commun., vol. 41, no. 1, pp. 170–185, 2023.
[7] Ke Yang et al., "WITT: A wireless image transmission transformer for semantic communications," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 2023, pp. 1–5.
[8] Huiqiang Xie, Zhijin Qin, and Geoffrey Ye Li, "Semantic communication with memory," IEEE J. Sel. Areas Commun., vol. 41, no. 8, pp. 2658–2669, 2023.
[9] Guangming Shi et al., "From semantic communication to semantic-aware networking: model, architecture, and open problems," IEEE Commun. Magazine, vol. 59, no. 8, pp. 44–50, 2021.
[10] Xuewen Luo, Hsiao-Hwa Chen, and Qing Guo, "Semantic communications: Overview, open issues, and future research directions," IEEE Wireless Commun., vol. 29, no. 1, pp. 210–219, 2022.
[11] Huiqiang Xie, Zhijin Qin, Geoffrey Ye Li, and Biing-Hwang Juang, "Deep learning enabled semantic communication systems," IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021.
[12] Lei Yan, Zhijin Qin, Rui Zhang, Yongzhao Li, and Geoffrey Ye Li, "Resource allocation for text semantic communications," IEEE Wireless Commun. Lett., vol. 11, no. 7, pp. 1394–1398, 2022.
[13] Xidong Mu et al., "Heterogeneous semantic and bit communications: A semi-NOMA scheme," IEEE J. Sel. Areas Commun., vol. 41, no. 1, pp. 155–169, 2023.
[14] Xidong Mu and Yuanwei Liu, "Exploiting semantic communication for non-orthogonal multiple access," IEEE J. Sel. Areas Commun., vol. 41, no. 8, pp. 2563–2576, 2023.
[15] Hyeonho Noh, Harim Lee, and Hyun Jong Yang, "Joint optimization on uplink OFDMA and MU-MIMO for IEEE 802.11ax: Deep hierarchical reinforcement learning approach," IEEE Commun. Lett., pp. 1–5, 2024.
[16] Nan Zhao et al., "Deep reinforcement learning for user association and resource allocation in heterogeneous cellular networks," IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5141–5152, 2019.
[17] Haijun Zhang et al., "Power control based on deep reinforcement learning for spectrum sharing," IEEE Trans. Wireless Commun., vol. 19, no. 6, pp. 4209–4219, 2020.
[18] Shaoyang Wang et al., "Joint resource management for MC-NOMA: A deep reinforcement learning approach," IEEE Trans. Wireless Commun., vol. 20, no. 9, pp. 5672–5688, 2021.
[19] Zi Qin Liew et al., "Economics of semantic communication system in wireless powered internet of things," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 2022, pp. 8637–8641.
[20] Tianxiao Han et al., "Semantic-preserved communication system for highly efficient speech transmission," IEEE J. Sel. Areas Commun., vol. 41, no. 1, pp. 245–259, 2023.
[21] Matthew E. Peters et al., "Deep contextualized word representations," in Proc. North Amer. Chapter Assoc. Comput. Linguistics: Hum. Lang. Tech., New Orleans, Louisiana, June 2018, pp. 2227–2237.
[22] Volodymyr Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.
[23] Philipp Koehn, "Europarl: A parallel corpus for statistical machine translation," in MT summit, 2005, pp. 79–86.
[24] Shengli Liu et al., "Joint user association and resource allocation for wireless hierarchical federated learning with IID and non-IID data," IEEE Trans. Wireless Commun., vol. 21, no. 10, pp. 7852–7866, 2022.
[25] Amin Abdel Khalek, Constantine Caramanis, and Robert W. Heath, "Delay-constrained video transmission: Quality-driven resource allocation and scheduling," IEEE J. Sel. Topics Signal Process., vol. 9, no. 1, pp. 60–75, 2015.
[26] Eunmi Chu, Janghyuk Yoon, and Bang Chul Jung, "A novel link-to-system mapping technique based on machine learning for 5G/IoT wireless networks," Sensors, vol. 19, no. 5, pp. 1196, 2019.
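The sentence similarity in (2) is the cosine similarity between two embedding vectors. The following is a minimal sketch of that computation only. It assumes the BERT embeddings B(s) and B(ŝ) are already available as plain lists of floats; the function name `sentence_similarity` and the 4-dimensional example vectors are hypothetical and do not come from the paper.

```python
import math
from typing import Sequence


def sentence_similarity(emb_s: Sequence[float], emb_s_hat: Sequence[float]) -> float:
    """Cosine similarity F(s, s_hat) between two embedding vectors, as in (2)."""
    if len(emb_s) != len(emb_s_hat):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(emb_s, emb_s_hat))
    norm_s = math.sqrt(sum(a * a for a in emb_s))
    norm_s_hat = math.sqrt(sum(b * b for b in emb_s_hat))
    if norm_s == 0.0 or norm_s_hat == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_s * norm_s_hat)


# Hypothetical embeddings standing in for the BERT outputs B(s) and B(s_hat).
b_s = [0.2, 0.1, 0.7, 0.4]
b_s_hat = [0.2, 0.1, 0.7, 0.4]   # perfectly recovered sentence
b_other = [1.0, 0.0, 0.0, 0.0]   # poorly recovered sentence

print(sentence_similarity(b_s, b_s_hat))
print(sentence_similarity(b_s, b_other))
```

In practice the embeddings would come from a pre-trained BERT model, as stated in Section II-C; that step is left out here to keep the sketch dependency-free.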
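The SNR expression of Section II-B, the short-term SR in (5), and the semantic frame-length condition in (6c) can be sketched as three small functions. This is an illustrative sketch under stated assumptions: the per-sentence similarities F are passed in as given numbers rather than produced by an encoder and decoder, and all function names and example values other than W = 180 kHz, L_frame = 1024 and I = 1 are hypothetical.

```python
from typing import Sequence


def snr(tx_power: float, channel_gain_sq: float, noise_var: float) -> float:
    """Gamma = P * |h|^2 / sigma^2 on a linear scale."""
    return tx_power * channel_gain_sq / noise_var


def fits_frame_semantic(c: int, sentence_lengths: Sequence[int], frame_len: int) -> bool:
    """Condition (6c): sum_j c * L_j <= L_frame."""
    return sum(c * length for length in sentence_lengths) <= frame_len


def semantic_rate(bandwidth_hz: float,
                  info: Sequence[float],
                  similarities: Sequence[float],
                  frame_len: int) -> float:
    """Short-term SR of (5): sum_j W * I_j * F_j / L_frame."""
    if len(info) != len(similarities):
        raise ValueError("one similarity value is needed per sentence")
    return sum(bandwidth_hz * i * f for i, f in zip(info, similarities)) / frame_len


# Setup values from Section IV-A; the similarities and lengths are hypothetical.
W = 180e3
L_FRAME = 1024
lengths = [20, 30, 25]            # words per sentence (assumed)
sims = [0.9, 0.8, 0.85]           # assumed decoded similarities
info = [1.0] * len(lengths)       # I_{j,k} = 1 as in the paper

print(fits_frame_semantic(8, lengths, L_FRAME))
print(semantic_rate(W, info, sims, L_FRAME))
```

The sketch only evaluates the metric and the constraint. How sentences that do not fit into the frame are handled is not specified in the text and is therefore not modeled.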
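Section III-A describes selecting the number of symbols per word from a precomputed SR table. One possible reading of that lookup is sketched below: among the candidate values of c that satisfy the frame-length condition (6c), pick the one with the largest tabulated SR. The table row, its values, and the function name `best_symbols_per_word` are hypothetical placeholders, and the actual table layout used by the authors may differ.

```python
from typing import Dict, Optional, Sequence


def best_symbols_per_word(sr_by_c: Dict[int, float],
                          sentence_lengths: Sequence[int],
                          frame_len: int) -> Optional[int]:
    """Return the feasible c with the highest tabulated SR, or None if none fits.

    sr_by_c maps a candidate c (symbols per word) to a precomputed SR value
    for the current SNR, i.e. one row of an SR table.
    """
    total_words = sum(sentence_lengths)
    # Keep only candidates that satisfy (6c): c * sum_j L_j <= L_frame.
    feasible = [c for c in sr_by_c if c * total_words <= frame_len]
    if not feasible:
        return None
    return max(feasible, key=lambda c: sr_by_c[c])


# Hypothetical SR table row for one SNR value (numbers are illustrative only).
sr_row = {2: 210.0, 4: 260.0, 8: 290.0, 16: 300.0}

print(best_symbols_per_word(sr_row, [20, 30, 25], 1024))
print(best_symbols_per_word(sr_row, [10], 1024))
```

With 75 words in total, c = 16 would need 1200 symbols and is excluded, so the lookup falls back to the best value that fits the frame.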
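The training steps of Section III-C (linearly decaying epsilon-greedy selection, the target y_j, and the loss J(θ)) can be sketched as follows. The sketch assumes the Q-values are supplied as plain lists, so the neural network and the gradient step are deliberately left out; all function names are hypothetical, and γ denotes the discount factor.

```python
import random
from typing import Sequence


def epsilon(episode: int, decay_z: float) -> float:
    """Linear decay: eps(e) = max(1 - e / Z, 0.01)."""
    return max(1.0 - episode / decay_z, 0.01)


def select_action(q_values: Sequence[float], eps: float, rng: random.Random) -> int:
    """Epsilon-greedy: random action with probability eps, greedy otherwise."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward: float, next_q_values: Sequence[float],
              terminal: bool, gamma: float) -> float:
    """y_j = r_j if terminal, else r_j + gamma * max_a Q(S_{j+1}, a)."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def dqn_loss(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """J = sum_j (y_j - Q(S_j, a_j))^2 / B over a mini-batch of size B."""
    return sum((y - q) ** 2 for y, q in zip(targets, predictions)) / len(targets)


rng = random.Random(0)
print(epsilon(50, 100.0))
print(select_action([0.1, 0.9, 0.3], eps=0.0, rng=rng))
print(td_target(1.0, [0.5, 2.0], terminal=False, gamma=0.9))
print(dqn_loss([1.0, 3.0], [0.0, 1.0]))
```

A full implementation would compute the Q-values with a network parameterized by θ and minimize the loss with an optimizer; those parts depend on the chosen deep learning framework and are not shown.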
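The max-SNR baseline of Section IV-A and the state masking of Section III-B (setting Γ to −1 once a UE has been selected) can be illustrated together. The paper does not give the baseline's exact procedure, so the greedy RU-by-RU order below is an assumption, as are the function names and the example SNR matrix.

```python
from typing import List


def max_snr_assignment(snr: List[List[float]]) -> List[int]:
    """Greedy max-SNR allocation (assumed procedure).

    snr[n][k] is the SNR of UE k on RU n. Returns ue_for_ru, where
    ue_for_ru[n] is the UE served on RU n, or -1 if no UE is left.
    """
    snr = [row[:] for row in snr]          # work on a copy of the state
    num_ue = len(snr[0]) if snr else 0
    ue_for_ru: List[int] = []
    for n in range(len(snr)):
        if num_ue == 0:
            ue_for_ru.append(-1)
            continue
        k_best = max(range(num_ue), key=lambda k: snr[n][k])
        if snr[n][k_best] < 0:             # every UE is already masked
            ue_for_ru.append(-1)
            continue
        ue_for_ru.append(k_best)
        for row in snr:                    # mask the UE on all RUs
            row[k_best] = -1.0
    return ue_for_ru


def satisfies_ra_constraints(ue_for_ru: List[int]) -> bool:
    """Check (1a): no UE appears on more than one RU.

    (1b) holds by construction because each RU stores a single UE index.
    """
    assigned = [k for k in ue_for_ru if k >= 0]
    return len(assigned) == len(set(assigned))


# Hypothetical linear-scale SNRs for 2 RUs and 3 UEs.
example = [[3.0, 9.0, 5.0],
           [4.0, 12.0, 6.0]]
allocation = max_snr_assignment(example)
print(allocation, satisfies_ra_constraints(allocation))
```

Because the mask value −1 lies below any valid SNR, a masked UE can never win the per-RU maximum, which is the same mechanism the state definition uses to mark unavailable UEs.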