Preface
A recent project of mine needs to stream audio between two phones in real time over TCP, and to save bandwidth the stream also has to be encoded and decoded with AAC, so I did some research on AAC. And since the audio data is AAC-encoded, playing it back involves `AudioTrack`.
Prerequisites
"Some" Android development experience (at the very least, you have written Android code on your own and can troubleshoot problems independently). (There is a small rant about this; see the end of the post.)
A basic idea of how MediaCodec is used.
What you will learn from this post
How to record audio on an Android device
How to encode PCM to AAC with MediaCodec
How to decode AAC to PCM with MediaCodec
How to play PCM on an Android device
How to convert PCM to WAV
Audio basics
PCM (Pulse-code modulation): the technique of converting sound from an analog signal into a digital signal.
PCM data is the raw audio data. Whatever the audio format, what the audio driver ultimately processes is PCM data. Because PCM data is unencoded, it is very large — somewhat like a BMP image file.
A raw PCM file cannot be played directly by a player, because it contains nothing but the raw samples: the file records no information about how the data was captured — sample rate, channel count, bit depth and so on — so a player has no way to interpret it.
Sample rate (Sample Rate): the highest frequency the human ear can perceive is about 20 kHz, so covering the audible range requires at least 40,000 samples per second, i.e. 40 kHz (the Nyquist criterion). The familiar CD sample rate is 44.1 kHz.
Channel count (Channel): generally the number of sound sources during recording, or the number of speakers during playback. Mono and stereo are the most common.
Bit depth (Bit Depth): how many binary bits each sample is stored with. Unit: bit. More bits mean better sound quality — and more data. Common bit depths are 8 bit and 16 bit; 16 bit is the mainstream choice, while 8 bit can be used for low-quality voice transmission.
Bitrate (Bitrate) (in the audio/video industry also called "code rate"): the number of bits transmitted or processed per unit of time. Unit: bps (bits per second) or kbps. The higher the bitrate, the lower the compression ratio and the better the sound quality — and the larger the audio data.
Commonly used audio bitrates are 96 kbps, 128 kbps, 192 kbps, 256 kbps and 320 kbps.
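Putting the three recording parameters together: the raw PCM data rate is simply sample rate × channel count × bit depth. A quick sketch in plain Java (mine, not part of the project source):

```java
public class PcmRate {
    /** Raw PCM bit rate in bits per second: sampleRate × channels × bitDepth. */
    public static int bitsPerSecond(int sampleRate, int channels, int bitDepth) {
        return sampleRate * channels * bitDepth;
    }

    public static void main(String[] args) {
        // CD audio: 44.1 kHz, stereo, 16-bit → 1,411,200 bps ≈ 1.4 Mbps,
        // i.e. about 176 KB of data per second before any compression.
        System.out.println(bitsPerSecond(44100, 2, 16));
    }
}
```

This is exactly why raw PCM is rarely sent over the network: one second of CD-quality PCM is roughly 176 KB, while the same second encoded as 128 kbps AAC is only 16 KB.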
WAV (WAVE file): a common audio format — a playable, uncompressed container for raw audio. The file has two parts: a WAV header and the PCM audio data. In other words, prepend a WAV header to a PCM file and a player can play the file directly.
AAC (Advanced Audio Coding): a common audio format — a lossy compression format designed specifically for audio, and one of the most widely used. Compared with MP3, AAC gives better quality in smaller files, and Android's hardware encoders support it natively. Raw captured PCM audio is generally not sent over the network as-is; it is first compressed to AAC, which improves transmission efficiency and saves bandwidth.
About MediaCodec
When I covered the basics of H.264 earlier I also used MediaCodec, but that article contained no code. Since the usage is the same, I'll make up for that here with actual code.
MediaCodec is the audio/video codec API that Android introduced in API 16 (Android 4.1). At the application layer it is the unified entry point for encoding and decoding, and its configuration parameters determine which codec algorithm is used and whether hardware acceleration is applied. The internet says: "because it uses hardware codecs there are quite a few compatibility problems; MediaCodec is said to be full of pitfalls." But mainstream devices today run API levels far above 16 on much better hardware, and I have hit none of those supposed compatibility problems on the devices at hand. That "MediaCodec is full of pitfalls" is true, though — playback frequently fails because of mismatched configuration.
MediaCodec processes data asynchronously, using a producer-consumer model built on ring buffers. On the input side, the client is the producer of the ring buffer and MediaCodec is the consumer; on the output side, MediaCodec is the producer and the client becomes the consumer.
The workflow looks like this:
1. The client requests an empty buffer from the input queue [dequeueInputBuffer].
2. The client copies the data to be encoded/decoded into the empty buffer, then puts it on the input queue [queueInputBuffer].
3. MediaCodec takes one frame of data from the input queue and encodes/decodes it.
4. When processing finishes, MediaCodec marks the original buffer empty, returns it to the input queue, and puts the encoded/decoded data on the output queue.
5. The client requests a processed buffer from the output queue [dequeueOutputBuffer].
6. The client renders/plays the processed buffer.
7. After rendering/playback, the client returns the buffer to the output queue [releaseOutputBuffer].
The basic MediaCodec call sequence:

```
createEncoderByType / createDecoderByType
configure
start
while (true) {
    dequeueInputBuffer
    queueInputBuffer
    dequeueOutputBuffer
    releaseOutputBuffer
}
stop
release
```
Overall project flow
Sender: record (PCM) -> encode with MediaCodec (AAC) -> send the AAC audio stream
Receiver: receive the AAC audio stream -> decode with MediaCodec (PCM) -> play the PCM
The flow is fairly straightforward. The techniques involved:
Recording
Encoding/decoding audio data with MediaCodec (PCM <-> AAC)
Playing PCM
Note: confidential information has been removed from all source code in this post; this does not affect its use. If you copy the code verbatim it will complain about a missing logging class and a missing constants class — substitute your own logging utility for the former, and for the latter the constant names make their meaning clear, so set them to the values you need.
Recording + encoding (PCM -> AAC)
Recording on Android is simple — there are plenty of examples online and no real pitfalls — so I won't go into detail. Here is the rough flow, then the code.
Recording involves three parameters: sample rate, channel count and bit depth. For best compatibility, the recommended configuration is 44.1 kHz, mono, 16 bit.
Recording flow
1. Create an `AudioRecord` object. The minimum recording buffer size can be obtained from `AudioRecord.getMinBufferSize`; if the buffer you configure is too small, constructing the object fails. It is recommended to use 2-4x the value `getMinBufferSize` returns.
2. Allocate a buffer at least as large as the buffer `AudioRecord` writes its audio data into.
3. Start recording.
4. Open an output stream, and in a loop read audio data from `AudioRecord` into the buffer while writing the buffer's contents to the stream.
5. Close the stream.
6. Stop recording.
The encoder needs these audio parameters: sample rate, channel count, bitrate and the AAC profile. The most commonly used profile is AAC-LC (Low Complexity), which strikes a balance between coding efficiency and quality at medium bitrates — roughly 96-192 kbps. So if you use this profile, try to keep the bitrate within that range.
Echo cancellation / automatic gain / noise suppression
Recordings usually suffer from echo; a couple of settings can fix that.
Option 1: when initializing `AudioRecord`, use `MediaRecorder.AudioSource.VOICE_COMMUNICATION` instead of `MediaRecorder.AudioSource.MIC`. `VOICE_COMMUNICATION` enables echo cancellation and automatic gain control, which improves the recording.
Option 2: enable the corresponding effect classes — `AcousticEchoCanceler`, `AutomaticGainControl` and `NoiseSuppressor`. Before using each one, check whether the device supports that feature. See the source below for details.
Source code: MicRecord.java
Note: this class performs AAC encoding while recording.
```java
import android.media.AudioRecord;
import android.media.MediaRecorder;
import android.media.audiofx.AcousticEchoCanceler;
import android.media.audiofx.AutomaticGainControl;
import android.media.audiofx.NoiseSuppressor;
import android.os.SystemClock;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MicRecord {
    private static final String TAG = MicRecord.class.getSimpleName();

    private ExecutorService mThreadPool = Executors.newFixedThreadPool(1);
    private boolean mIsRecording;
    private AacEncoder mAacEncoder;
    private AudioRecord mAudioRecord;

    private int mSampleRate;
    private int mChannelMask;
    private int mAudioFormat;
    private int mRecordBufferSize;

    public MicRecord(int sampleRate, int bitrate, int channelCount, int channelMask, int audioFormat) {
        // Default to 2x the minimum buffer size for safety.
        this(sampleRate, bitrate, channelCount, channelMask, audioFormat,
                AudioRecord.getMinBufferSize(sampleRate, channelMask, audioFormat) * 2);
    }

    public MicRecord(int sampleRate, int bitrate, int channelCount, int channelMask, int audioFormat, int recordBufferSize) {
        CLog.i(TAG, "recordBufferSize=%d", recordBufferSize);
        mSampleRate = sampleRate;
        mChannelMask = channelMask;
        mAudioFormat = audioFormat;
        mRecordBufferSize = recordBufferSize;
        mAacEncoder = new AacEncoder(sampleRate, bitrate, channelCount);
        createAudioRecord();
        initAdvancedFeatures();
    }

    private void createAudioRecord() {
        // VOICE_COMMUNICATION enables echo cancellation and automatic gain control.
        mAudioRecord = new AudioRecord(
                MediaRecorder.AudioSource.VOICE_COMMUNICATION,
                mSampleRate, mChannelMask, mAudioFormat, mRecordBufferSize);
    }

    private void initAdvancedFeatures() {
        // Check availability before enabling each audio effect.
        if (AcousticEchoCanceler.isAvailable()) {
            AcousticEchoCanceler aec = AcousticEchoCanceler.create(mAudioRecord.getAudioSessionId());
            if (aec != null) {
                aec.setEnabled(true);
            }
        }
        if (AutomaticGainControl.isAvailable()) {
            AutomaticGainControl agc = AutomaticGainControl.create(mAudioRecord.getAudioSessionId());
            if (agc != null) {
                agc.setEnabled(true);
            }
        }
        if (NoiseSuppressor.isAvailable()) {
            NoiseSuppressor nc = NoiseSuppressor.create(mAudioRecord.getAudioSessionId());
            if (nc != null) {
                nc.setEnabled(true);
            }
        }
    }

    public void stop() {
        mIsRecording = false;
        if (null != mAudioRecord) {
            try {
                mAudioRecord.stop();
            } catch (Exception e) {
                CLog.e(TAG, e);
            }
        }
    }

    public void release() {
        stop();
        if (null != mAudioRecord) {
            try {
                mAudioRecord.release();
            } catch (Exception e) {
                CLog.e(TAG, e);
            }
            mAudioRecord = null;
        }
        if (mAacEncoder != null) {
            try {
                mAacEncoder.close();
            } catch (Exception e) {
                CLog.e(TAG, e);
            }
            mAacEncoder = null;
        }
        if (mThreadPool != null) {
            mThreadPool.shutdownNow();
        }
    }

    public void doRecord(Callback callback) {
        mAudioRecord.startRecording();
        mIsRecording = true;
        mThreadPool.execute(() -> {
            try {
                byte[] audioData = new byte[mRecordBufferSize];
                BufferedOutputStream os = null;
                BufferedOutputStream aacOs = null;
                if (Constants.DEBUG_MODE) {
                    // In debug mode, also dump the raw PCM and the encoded AAC to disk.
                    String outputFolder = Objects.requireNonNull(
                            CustomApplication.getInstance().getExternalFilesDir(null)).getAbsolutePath()
                            + File.separator + "leo-audio";
                    File folder = new File(outputFolder);
                    if (!folder.exists()) {
                        boolean mkdirStatus = folder.mkdirs();
                        CLog.i(TAG, "mkdir [%s] %s", outputFolder, mkdirStatus);
                    }
                    File file = new File(outputFolder, "original.pcm");
                    File aacFile = new File(outputFolder, "original.aac");
                    String filename = file.getAbsolutePath();
                    String aacFilename = aacFile.getAbsolutePath();
                    os = new BufferedOutputStream(new FileOutputStream(filename));
                    aacOs = new BufferedOutputStream(new FileOutputStream(aacFilename));
                }
                byte[] aacAudioData;
                long st;
                int readSize;
                while (mIsRecording) {
                    readSize = mAudioRecord.read(audioData, 0, mRecordBufferSize);
                    if (AudioRecord.ERROR_INVALID_OPERATION != readSize) {
                        st = SystemClock.elapsedRealtime();
                        aacAudioData = mAacEncoder.encodePcmToAac(audioData);
                        CLog.i(TAG, "Encode audio cost=%d", SystemClock.elapsedRealtime() - st);
                        if (Constants.DEBUG_MODE) {
                            os.write(audioData);
                            aacOs.write(aacAudioData);
                        }
                        callback.onCallback(aacAudioData);
                    }
                }
            } catch (Exception e) {
                CLog.e(TAG, e);
            }
        });
    }

    public boolean isRecording() {
        return mIsRecording;
    }

    public interface Callback {
        void onCallback(byte[] data);
    }
}
```
AacEncoder.java
```java
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class AacEncoder {
    private static final String TAG = AacEncoder.class.getSimpleName();

    private MediaCodec.BufferInfo mBufferInfo;
    private long mPresentationTimeUs = 0;
    private ByteArrayOutputStream mOutputAacStream = new ByteArrayOutputStream();
    private MediaCodec mAudioEncoder;

    public AacEncoder(int sampleRate, int bitrate, int channelCount) {
        try {
            mAudioEncoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
        } catch (IOException e) {
            CLog.e(TAG, e, "Init AacEncoder error.");
        }
        MediaFormat mediaFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, sampleRate, channelCount);
        mediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
        mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, bitrate);
        mediaFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 32 * 1024);
        mAudioEncoder.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        mAudioEncoder.start();
        mBufferInfo = new MediaCodec.BufferInfo();
    }

    public byte[] encodePcmToAac(byte[] pcmData) throws Exception {
        int inputBufferIndex = mAudioEncoder.dequeueInputBuffer(-1);
        if (inputBufferIndex >= 0) {
            ByteBuffer inputBuffer = mAudioEncoder.getInputBuffer(inputBufferIndex);
            if (inputBuffer != null) {
                inputBuffer.clear();
                inputBuffer.put(pcmData);
            }
            long pts = computePresentationTimeUs(mPresentationTimeUs);
            mAudioEncoder.queueInputBuffer(inputBufferIndex, 0, pcmData.length, pts, 0);
            mPresentationTimeUs += 1;
        }

        int outputBufferIndex = mAudioEncoder.dequeueOutputBuffer(mBufferInfo, 0);
        while (outputBufferIndex >= 0) {
            ByteBuffer outputBuffer = mAudioEncoder.getOutputBuffer(outputBufferIndex);
            if (outputBuffer != null) {
                int outAacDataSize = mBufferInfo.size;
                // The ADTS header is 7 bytes, and its length field covers the whole frame.
                int outAacDataSizeWithAdts = outAacDataSize + 7;
                outputBuffer.position(mBufferInfo.offset);
                outputBuffer.limit(mBufferInfo.offset + outAacDataSize);
                byte[] outAacDataWithAdts = new byte[outAacDataSizeWithAdts];
                addAdtsToData(outAacDataWithAdts, outAacDataSizeWithAdts);
                outputBuffer.get(outAacDataWithAdts, 7, outAacDataSize);
                outputBuffer.position(mBufferInfo.offset);
                mOutputAacStream.write(outAacDataWithAdts);
            }
            mAudioEncoder.releaseOutputBuffer(outputBufferIndex, false);
            outputBufferIndex = mAudioEncoder.dequeueOutputBuffer(mBufferInfo, 0);
        }
        mOutputAacStream.flush();
        byte[] outAacBytes = mOutputAacStream.toByteArray();
        mOutputAacStream.reset();
        return outAacBytes;
    }

    private void addAdtsToData(byte[] outAacDataWithAdts, int outAacDataLenWithAdts) {
        int profile = 2;  // AAC-LC
        int freqIdx = 4;  // 4 = 44.1 kHz; this must match the actual sample rate (8 would be 16 kHz)
        int chanCfg = 1;  // mono
        outAacDataWithAdts[0] = (byte) 0xFF;
        outAacDataWithAdts[1] = (byte) 0xF9;
        outAacDataWithAdts[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
        outAacDataWithAdts[3] = (byte) (((chanCfg & 3) << 6) + (outAacDataLenWithAdts >> 11));
        outAacDataWithAdts[4] = (byte) ((outAacDataLenWithAdts & 0x7FF) >> 3);
        outAacDataWithAdts[5] = (byte) (((outAacDataLenWithAdts & 7) << 5) + 0x1F);
        outAacDataWithAdts[6] = (byte) 0xFC;
    }

    public void close() {
        try {
            if (mAudioEncoder != null) {
                mAudioEncoder.stop();
                mAudioEncoder.release();
            }
            if (mOutputAacStream != null) {
                mOutputAacStream.flush();
                mOutputAacStream.close();
            }
        } catch (Exception e) {
            CLog.e(TAG, e, "close error.");
        }
    }

    public static long computePresentationTimeUs(long frameIndex) {
        // Note: these are hard-coded magic numbers. The usual formula for a PTS in
        // microseconds is frameIndex * 1_000_000L * 1024 / sampleRate (1024 samples per AAC frame).
        return frameIndex * 32000 * 1024 / 44100;
    }
}
```
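One caveat about `addAdtsToData` above: it hard-codes `freqIdx = 4` (the index for 44.1 kHz) and `chanCfg = 1` (mono). The ADTS frequency index is a lookup into a fixed table defined by the AAC spec, so if you record at another sample rate (this project's `MainActivity` defaults to 16 kHz), the header must change accordingly. A small helper along these lines — my sketch, not part of the original source — keeps the index in sync:

```java
public class AdtsFreq {
    // Sampling-frequency table from the MPEG-4 AAC specification;
    // the ADTS header stores the index into this table, not the rate itself.
    private static final int[] SAMPLE_RATES = {
            96000, 88200, 64000, 48000, 44100, 32000,
            24000, 22050, 16000, 12000, 11025, 8000, 7350
    };

    public static int freqIdx(int sampleRate) {
        for (int i = 0; i < SAMPLE_RATES.length; i++) {
            if (SAMPLE_RATES[i] == sampleRate) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unsupported sample rate: " + sampleRate);
    }

    public static void main(String[] args) {
        System.out.println(freqIdx(44100)); // 4 — the value hard-coded above
        System.out.println(freqIdx(16000)); // 8 — what a 16 kHz recording needs
    }
}
```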
The other classes used above are listed below:
CustomApplication.java
```java
import android.util.Log;

import androidx.multidex.MultiDexApplication;

public class CustomApplication extends MultiDexApplication {
    private static final String TAG = CustomApplication.class.getSimpleName();

    private static CustomApplication mInstance;

    @Override
    public void onCreate() {
        super.onCreate();
        Log.i(TAG, "onCreate()");
        mInstance = this;
    }

    public static CustomApplication getInstance() {
        return mInstance;
    }
}
```
Callback.java
```java
public interface Callback {
    void onCallback(byte[] data);
}
```
Decoding + playback
Decoding: for how to use MediaCodec, look back at the "About MediaCodec" section above; I won't repeat it here. Also note that encoding and decoding with MediaCodec look very similar at the code level.
The parameters required for decoding are: sample rate, channel count, AAC profile and `csd-0`. In practice you usually set a few other parameters as well.
Note that the `csd-0` used for decoding is not a fixed value; it is computed from the sample rate, channel count and AAC profile. A lot of code online just hands you a value without telling you how to compute it. If the `csd-0` you set does not match the sample rate, channel count and AAC profile, decoding will fail — this is the most common problem in development.
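To make that dependency explicit, the two-byte `csd-0` (the AAC `AudioSpecificConfig`) can be derived from the three parameters as below. The bit layout is 5 bits of audio object type (2 = AAC-LC), 4 bits of sampling-frequency index (e.g. 8 = 16 kHz), 4 bits of channel configuration, then padding. This sketch is mine, not part of the original source:

```java
public class Csd0 {
    /**
     * Builds the 2-byte AudioSpecificConfig (csd-0) for AAC.
     * Bit layout: AAAAABBB BCCCC000
     *   A = audio object type (2 = AAC-LC)
     *   B = sampling-frequency index (4 = 44.1 kHz, 8 = 16 kHz, ...)
     *   C = channel configuration (1 = mono, 2 = stereo)
     */
    public static byte[] build(int audioObjectType, int freqIdx, int channelConfig) {
        byte[] csd0 = new byte[2];
        csd0[0] = (byte) ((audioObjectType << 3) | (freqIdx >> 1));
        csd0[1] = (byte) (((freqIdx & 1) << 7) | (channelConfig << 3));
        return csd0;
    }

    public static void main(String[] args) {
        // AAC-LC, 16 kHz (index 8), mono → {0x14, 0x08},
        // which matches the AUDIO_CSD_0 constant used in MainActivity below.
        byte[] csd0 = build(2, 8, 1);
        System.out.printf("0x%02X 0x%02X%n", csd0[0], csd0[1]);
    }
}
```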
A real trap when decoding is that there are very many `MediaFormat` keys you could set, but you do not need to set them all — inappropriate settings will break decoding. So when decoding fails, the most likely cause is a configuration problem. I've pointed out some common settings in the source comments for reference.
For example: if you added ADTS headers when encoding the AAC, you must set `MediaFormat.KEY_IS_ADTS` to `1` when decoding.
Playback
The parameters needed to play PCM are: sample rate, bit depth and channel mask (channelConfig).
Once MediaCodec has decoded the audio data back to PCM, you can play it directly with `AudioTrack`. The playback code is simple and pitfall-free, so here it is without further commentary.
Source code: AacDecoder.java
```java
import android.media.AudioManager;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.os.SystemClock;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

public class AacDecoder {
    private static final String TAG = AacDecoder.class.getSimpleName();

    private BufferedOutputStream mAacOs;
    private AtomicLong presentationTimeUs = new AtomicLong(0);
    private AudioTrack mAudioTrack;
    private MediaCodec mAudioDecoder;

    public AacDecoder(int sampleRate, int channelConfig, int audioFormat, byte[] csd0, int trackBufferSize) {
        CLog.i(TAG, "sampleRate=%d channelConfig=%d audioFormat=%d trackBufferSize=%d csd0=%s",
                sampleRate, channelConfig, audioFormat, trackBufferSize, JsonUtil.toHexadecimalString(csd0));
        initAudioDecoder(sampleRate, csd0);
        initAudioTrack(sampleRate, channelConfig, audioFormat, trackBufferSize);
    }

    public AacDecoder(int sampleRate, int channelConfig, int audioFormat, byte[] csd0) {
        this(sampleRate, channelConfig, audioFormat, csd0,
                AudioTrack.getMinBufferSize(sampleRate, channelConfig, audioFormat) * 2);
    }

    public void writeDataToDisk(byte[] audioData) {
        try {
            if (mAacOs != null) {
                mAacOs.write(audioData);
            } else {
                CLog.e(TAG, "mAacOs is null");
            }
        } catch (Exception e) {
            CLog.e(TAG, "You can ignore this message safely. writeDataToDisk error");
        }
    }

    public void closeOutputStream() {
        if (mAacOs == null) {
            return;
        }
        try {
            CLog.w(TAG, "END-OF-AUDIO close stream.");
            mAacOs.flush();
            mAacOs.close();
        } catch (Exception e) {
            CLog.e(TAG, e, "closeOutputStream error");
        }
    }

    public void initOutputStream() {
        CLog.w(TAG, "START-AUDIO init stream.");
        String outputFolder = Objects.requireNonNull(
                CustomApplication.getInstance().getExternalFilesDir(null)).getAbsolutePath()
                + File.separator + "leo-audio";
        File folder = new File(outputFolder);
        if (!folder.exists()) {
            boolean succ = folder.mkdirs();
            if (!succ) {
                CLog.e(TAG, "Can not create output file=%s", outputFolder);
            }
        }
        File aacFile = new File(outputFolder, "received-original-" + SystemClock.elapsedRealtime() + ".aac");
        final String aacFilename = aacFile.getAbsolutePath();
        try {
            mAacOs = new BufferedOutputStream(new FileOutputStream(aacFilename), 32 * 1024);
        } catch (Exception e) {
            CLog.e(TAG, e, "initOutputStream error.");
        }
    }

    private void initAudioTrack(int sampleRate, int channelConfig, int audioFormat, int bufferSize) {
        mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate, channelConfig, audioFormat,
                bufferSize, AudioTrack.MODE_STREAM);
        mAudioTrack.play();
    }

    private void initAudioDecoder(int sampleRate, byte[] csd0) {
        try {
            mAudioDecoder = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
            // Channel count is hard-coded to 1 (mono); it must match the encoder side.
            MediaFormat mediaFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, sampleRate, 1);
            mediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
            // The encoder adds ADTS headers, so tell the decoder the stream contains them.
            mediaFormat.setInteger(MediaFormat.KEY_IS_ADTS, 1);
            // csd-0 must be computed from sample rate, channel count and AAC profile.
            ByteBuffer csd_0 = ByteBuffer.wrap(csd0);
            mediaFormat.setByteBuffer("csd-0", csd_0);
            mAudioDecoder.configure(mediaFormat, null, null, 0);
        } catch (IOException e) {
            CLog.e(TAG, e, "initAudioDecoder error.");
        }
        if (mAudioDecoder == null) {
            CLog.e(TAG, "mAudioDecoder is null");
            return;
        }
        mAudioDecoder.start();
    }

    public void decodeAndPlay(byte[] audioData) {
        long st = SystemClock.elapsedRealtime();
        try {
            MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
            ByteBuffer inputBuffer;
            ByteBuffer outputBuffer;
            int inputIndex = mAudioDecoder.dequeueInputBuffer(-1);
            CLog.v(TAG, "inputIndex=%d", inputIndex);
            if (inputIndex < 0) {
                return;
            }
            inputBuffer = mAudioDecoder.getInputBuffer(inputIndex);
            if (inputBuffer != null) {
                inputBuffer.clear();
                inputBuffer.put(audioData);
            }
            mAudioDecoder.queueInputBuffer(inputIndex, 0, audioData.length,
                    AacEncoder.computePresentationTimeUs(presentationTimeUs.getAndIncrement()), 0);

            int outputIndex = mAudioDecoder.dequeueOutputBuffer(bufferInfo, 0);
            CLog.d(TAG, "outputIndex=%d", outputIndex);
            byte[] chunkPCM;
            while (outputIndex >= 0) {
                outputBuffer = mAudioDecoder.getOutputBuffer(outputIndex);
                chunkPCM = new byte[bufferInfo.size];
                if (outputBuffer != null) {
                    outputBuffer.get(chunkPCM);
                } else {
                    CLog.e(TAG, "outputBuffer is null");
                }
                if (chunkPCM.length > 0) {
                    if (Constants.DEBUG_MODE) {
                        CLog.d(TAG, "PCM data[%d]", chunkPCM.length);
                    }
                    mAudioTrack.write(chunkPCM, 0, chunkPCM.length);
                }
                mAudioDecoder.releaseOutputBuffer(outputIndex, false);
                outputIndex = mAudioDecoder.dequeueOutputBuffer(bufferInfo, 0);
            }
        } catch (Exception e) {
            CLog.e(TAG, "You can ignore this message safely. decodeAndPlay error");
        } finally {
            long ed = SystemClock.elapsedRealtime();
            CLog.d(TAG, "Decode=%dms", ed - st);
        }
    }

    public void close() {
        try {
            if (mAudioTrack != null) {
                mAudioTrack.stop();
            }
            if (mAudioDecoder != null) {
                mAudioDecoder.stop();
            }
        } catch (Exception e) {
            CLog.e(TAG, e, "close error.");
        }
    }

    public void release() {
        try {
            close();
            if (mAudioTrack != null) {
                mAudioTrack.release();
                mAudioTrack = null;
            }
            if (mAudioDecoder != null) {
                mAudioDecoder.release();
                mAudioDecoder = null;
            }
        } catch (Exception e) {
            CLog.e(TAG, e, "release error.");
        }
    }

    public AudioTrack getAudioTrack() {
        return mAudioTrack;
    }

    public void setAudioTrack(AudioTrack audioTrack) {
        this.mAudioTrack = audioTrack;
    }

    public MediaCodec getAudioDecoder() {
        return mAudioDecoder;
    }

    public void setAudioDecoder(MediaCodec audioDecoder) {
        this.mAudioDecoder = audioDecoder;
    }
}
```
Socket-related source code
I use Netty as the network transport layer; it has too many advantages to list here — feel free to Google/Baidu them. Since this is demo code, there is not much of it and it is straightforward: just basic Netty usage.
TcpClient.java
```java
import android.util.Log;

import com.ho1ho.audioexample.utils.others.ByteUtil;

import java.util.Arrays;

import io.netty.bootstrap.Bootstrap;
import io.netty.buffer.Unpooled;
import io.netty.channel.Channel;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public class TcpClient {
    private static final String TAG = TcpClient.class.getSimpleName();

    public static int PORT = 9999;

    public Bootstrap bootstrap;
    public Channel channel;

    public void initClient(String host) {
        bootstrap = getBootstrap();
        channel = getChannel(host, PORT);
    }

    public Bootstrap getBootstrap() {
        EventLoopGroup group = new NioEventLoopGroup();
        Bootstrap b = new Bootstrap();
        b.group(group).channel(NioSocketChannel.class);
        b.handler(new ChannelInitializer<Channel>() {
            @Override
            protected void initChannel(Channel ch) throws Exception {
                ChannelPipeline pipeline = ch.pipeline();
                pipeline.addLast(new LoggingHandler(LogLevel.DEBUG));
                pipeline.addLast("handler", new TcpClientHandler());
            }
        });
        b.option(ChannelOption.SO_KEEPALIVE, true);
        return b;
    }

    public Channel getChannel(String host, int port) {
        Channel channel = null;
        try {
            channel = bootstrap.connect(host, port).sync().channel();
        } catch (Exception e) {
            e.printStackTrace();
            Log.e(TAG, String.format("Connect to Server(IP=%s, PORT=%d) failed.", host, port));
            return null;
        }
        return channel;
    }

    public void sendMsg(String msg) throws Exception {
        if (channel != null) {
            channel.writeAndFlush(msg).sync();
        } else {
            Log.w(TAG, "Send data failed. Channel is uninitialized.");
        }
    }

    public void sendData(byte[] bytes) throws Exception {
        if (channel != null) {
            // Prepend a 4-byte big-endian length so the receiver can split frames.
            byte[] all = ByteUtil.mergeBytes(ByteUtil.int2Bytes(bytes.length), bytes);
            Log.i(TAG, "Sending data [" + all.length + "|" + bytes.length + "]: " + Arrays.toString(all));
            channel.writeAndFlush(Unpooled.wrappedBuffer(all)).sync();
        } else {
            Log.w(TAG, "Channel is uninitialized.");
        }
    }
}
```
TcpClientHandler.java
```java
import android.util.Log;

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;

public class TcpClientHandler extends SimpleChannelInboundHandler<Object> {
    private static final String TAG = TcpClientHandler.class.getSimpleName();

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, Object msg) throws Exception {
        Log.w(TAG, "Client received msg from server: " + msg);
    }
}
```
TcpServer.java
```java
import android.util.Log;

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public class TcpServer {
    private static final String TAG = TcpServer.class.getSimpleName();

    private static final int PORT = 9999;
    protected static final int BOSS_GROUP_SIZE = Runtime.getRuntime().availableProcessors() * 2;
    protected static final int WORKER_GROUP_SIZE = 4;

    private static TcpServerHandler handler = new TcpServerHandler();
    private static final EventLoopGroup bossGroup = new NioEventLoopGroup(BOSS_GROUP_SIZE);
    private static final EventLoopGroup workerGroup = new NioEventLoopGroup(WORKER_GROUP_SIZE);

    public static void run() throws Exception {
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup);
        b.channel(NioServerSocketChannel.class);
        b.childOption(ChannelOption.SO_KEEPALIVE, true);
        b.childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            public void initChannel(SocketChannel ch) throws Exception {
                ChannelPipeline pipeline = ch.pipeline();
                pipeline.addLast(new LoggingHandler(LogLevel.DEBUG));
                // Split the byte stream back into length-prefixed frames before handling.
                pipeline.addLast("messageDecoder", new CustomDecoder());
                pipeline.addLast(handler);
            }
        });
        b.bind(PORT).sync();
        Log.i(TAG, "Server started.");
    }

    protected static void shutdown() {
        workerGroup.shutdownGracefully();
        bossGroup.shutdownGracefully();
    }

    public static TcpServerHandler getHandler() {
        return handler;
    }
}
```
TcpServerHandler.java
```java
import android.util.Log;

import com.ho1ho.audioexample.MainActivity;
import com.ho1ho.audioexample.utils.AacDecoder;

import java.nio.charset.StandardCharsets;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufUtil;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;

public class TcpServerHandler extends SimpleChannelInboundHandler<Object> {
    private static final String TAG = TcpServerHandler.class.getSimpleName();

    private AacDecoder mDecoder;

    public TcpServerHandler() {
        mDecoder = new AacDecoder(MainActivity.DEFAULT_SAMPLE_RATES, MainActivity.CHANNEL_OUT,
                MainActivity.DEFAULT_AUDIO_FORMAT, MainActivity.AUDIO_CSD_0);
    }

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, Object msg) throws Exception {
        ByteBuf bb = (ByteBuf) msg;
        byte[] audioData = ByteBufUtil.getBytes(bb);
        if ("START-AUDIO".equals(new String(audioData, StandardCharsets.UTF_8))) {
            Log.e(TAG, "START-AUDIO");
            mDecoder.initOutputStream();
            return;
        }
        try {
            mDecoder.writeDataToDisk(audioData);
            mDecoder.decodeAndPlay(audioData);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
        Log.w(TAG, "exceptionCaught!", cause);
        ctx.close();
    }
}
```
CustomDecoder.java
```java
import java.util.List;

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

public class CustomDecoder extends ByteToMessageDecoder {
    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        int bufLen = in.readableBytes();
        // Wait until at least the 4-byte length prefix has arrived.
        if (bufLen < 4) {
            return;
        }
        in.markReaderIndex();
        int dataLen = in.readInt();
        // Not a complete frame yet; rewind and wait for more data.
        if (in.readableBytes() < dataLen) {
            in.resetReaderIndex();
            return;
        }
        out.add(in.readBytes(dataLen));
    }
}
```
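For clarity, the wire format that `sendData` writes and `CustomDecoder` parses is just a 4-byte big-endian length prefix followed by the payload. Stripped of Netty, the round trip looks like this (an illustration of the protocol, not project source):

```java
import java.nio.ByteBuffer;

public class LengthPrefixFraming {
    /** Prepends a 4-byte big-endian length to the payload, as sendData() does. */
    public static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /** Extracts one payload, as CustomDecoder does; returns null if incomplete. */
    public static byte[] deframe(ByteBuffer in) {
        if (in.remaining() < 4) {
            return null;               // not even the length prefix yet
        }
        in.mark();
        int dataLen = in.getInt();
        if (in.remaining() < dataLen) {
            in.reset();                // wait for the rest of the frame
            return null;
        }
        byte[] payload = new byte[dataLen];
        in.get(payload);
        return payload;
    }

    /** Frames then deframes a string, to demonstrate the round trip. */
    public static String roundTrip(String s) {
        return new String(deframe(ByteBuffer.wrap(frame(s.getBytes()))));
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello World")); // Hello World
    }
}
```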
Other source code
Note: since this is demo code, I did not implement runtime permission requests. When testing, just grant the app its permissions manually.
MainActivity.java
```java
import android.content.Intent;
import android.media.AudioFormat;
import android.os.Bundle;
import android.os.Handler;
import android.os.HandlerThread;
import android.util.Log;
import android.view.View;
import android.widget.EditText;

import androidx.appcompat.app.AppCompatActivity;

import com.ho1ho.audioexample.others.Callback;
import com.ho1ho.audioexample.tcp.TcpClient;
import com.ho1ho.audioexample.tcp.TcpServer;
import com.ho1ho.audioexample.utils.AudioPlay;
import com.ho1ho.audioexample.utils.MicRecord;
import com.ho1ho.audioexample.utils.others.PcmToWavUtil;

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

public class MainActivity extends AppCompatActivity {
    private static final String TAG = MainActivity.class.getSimpleName();

    private static final int[] SUPPORT_SAMPLE_RATES = {8000, 11025, 16000, 22050, 44100, 48000};
    private static final int[] SUPPORT_BITRATES = {16000, 32000, 64000, 96000, 128000, 192000, 256000};

    public static final int DEFAULT_SAMPLE_RATES = SUPPORT_SAMPLE_RATES[2]; // 16000 Hz
    public static final int DEFAULT_BIT_RATES = SUPPORT_BITRATES[0];        // 16000 bps
    public static final int DEFAULT_CHANNEL_COUNT = 1;
    public static final int DEFAULT_AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;
    public static final int CHANNEL_IN = AudioFormat.CHANNEL_IN_MONO;
    public static final int CHANNEL_OUT = AudioFormat.CHANNEL_OUT_MONO;
    // csd-0 for AAC-LC, 16 kHz, mono. Must be recomputed if those parameters change.
    public static final byte[] AUDIO_CSD_0 = new byte[]{(byte) 0x14, (byte) 0x08};

    private MicRecord mMicRecord;
    private AudioPlay mAudioPlay;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        Log.e(TAG, "CacheDir        =" + getCacheDir().getAbsoluteFile());
        Log.e(TAG, "ExternalCacheDir=" + getExternalCacheDir().getAbsoluteFile());
        Log.e(TAG, "FilesDir        =" + getFilesDir().getAbsoluteFile());
        Log.e(TAG, "ExternalFilesDir=" + getExternalFilesDir(null).getAbsoluteFile());
    }

    public void convertPcm2Wav() {
        String outputFolder = Objects.requireNonNull(getExternalFilesDir(null)).getAbsolutePath()
                + File.separator + "leo-audio";
        File file = new File(outputFolder, "original.pcm");
        File wavFile = new File(outputFolder, "original.wav");
        try {
            PcmToWavUtil.pcmToWav(file, wavFile, 1, DEFAULT_SAMPLE_RATES, 16);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void onRecordClick(View view) {
        mMicRecord = new MicRecord(DEFAULT_SAMPLE_RATES, DEFAULT_BIT_RATES, DEFAULT_CHANNEL_COUNT,
                CHANNEL_IN, DEFAULT_AUDIO_FORMAT);
        sendAudioDataToServer("START-AUDIO".getBytes(StandardCharsets.UTF_8));
        mMicRecord.doRecord(new Callback() {
            @Override
            public void onCallback(byte[] aacAudioData) {
                Log.w(TAG, "Sending audio data[" + aacAudioData.length + "]: " + Arrays.toString(aacAudioData));
                sendAudioDataToServer(aacAudioData);
            }
        });
    }

    private void releaseAllResources() {
        if (mMicRecord != null) {
            mMicRecord.stop();
        }
        if (mAudioPlay != null) {
            mAudioPlay.stop();
        }
    }

    public void onStopClick(View view) {
        Log.i(TAG, "onStopClick");
        try {
            sendAudioDataToServer("END-OF-AUDIO".getBytes(StandardCharsets.UTF_8));
        } catch (Exception e) {
            e.printStackTrace();
        }
        releaseAllResources();
        convertPcm2Wav();
    }

    public void onPlayAacClick(View view) {
        mAudioPlay = new AudioPlay(DEFAULT_SAMPLE_RATES, CHANNEL_OUT, DEFAULT_AUDIO_FORMAT, AUDIO_CSD_0);
        mAudioPlay.stop();
        mAudioPlay.playAac();
    }

    public void onPlayPcmClick(View view) {
        mAudioPlay = new AudioPlay(DEFAULT_SAMPLE_RATES, CHANNEL_OUT, DEFAULT_AUDIO_FORMAT, AUDIO_CSD_0);
        mAudioPlay.stop();
        mAudioPlay.playPcm();
    }

    public void initTcpServer() {
        new Handler().post(() -> {
            try {
                TcpServer.run();
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }

    public void onSendDataClick(View view) {
        sendAudioDataToServer("Hello World".getBytes(StandardCharsets.UTF_8));
    }

    private HandlerThread ht;
    private TcpClient client;

    public void sendAudioDataToServer(byte[] data) {
        if (client == null) {
            Log.e(TAG, "Client is null");
            return;
        }
        new Handler(ht.getLooper()).post(() -> {
            try {
                client.sendData(data);
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }

    public void onStartServerClick(View view) {
        initTcpServer();
    }

    public void onConnectServerClick(View view) {
        client = new TcpClient();
        String serverIp = ((EditText) findViewById(R.id.etSvrIp)).getText().toString();
        client.initClient(serverIp);
        ht = new HandlerThread("send-data");
        ht.start();
        Log.w(TAG, "Connected to server.");
    }

    public void onCaptureClick(View view) {
        startActivity(new Intent(this, CaptureImage.class));
    }
}
```
PcmToWavUtil.java
```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PcmToWavUtil {
    private static final int TRANSFER_BUFFER_SIZE = 10 * 1024;

    public static byte[] pcmToWav(byte[] pcmData, int numChannels, int sampleRate, int bitPerSample) {
        byte[] wavData = new byte[pcmData.length + 44];
        byte[] header = wavHeader(pcmData.length, numChannels, sampleRate, bitPerSample);
        System.arraycopy(header, 0, wavData, 0, header.length);
        System.arraycopy(pcmData, 0, wavData, header.length, pcmData.length);
        return wavData;
    }

    public static byte[] wavHeader(int pcmLen, int numChannels, int sampleRate, int bitPerSample) {
        byte[] header = new byte[44];
        // ChunkID: "RIFF"
        header[0] = 'R'; header[1] = 'I'; header[2] = 'F'; header[3] = 'F';
        // ChunkSize: 36 + PCM data length (little-endian)
        long chunkSize = pcmLen + 36;
        header[4] = (byte) (chunkSize & 0xff);
        header[5] = (byte) ((chunkSize >> 8) & 0xff);
        header[6] = (byte) ((chunkSize >> 16) & 0xff);
        header[7] = (byte) ((chunkSize >> 24) & 0xff);
        // Format: "WAVE"
        header[8] = 'W'; header[9] = 'A'; header[10] = 'V'; header[11] = 'E';
        // Subchunk1ID: "fmt "
        header[12] = 'f'; header[13] = 'm'; header[14] = 't'; header[15] = ' ';
        // Subchunk1Size: 16 for PCM
        header[16] = 16; header[17] = 0; header[18] = 0; header[19] = 0;
        // AudioFormat: 1 = PCM
        header[20] = 1; header[21] = 0;
        // NumChannels
        header[22] = (byte) numChannels; header[23] = 0;
        // SampleRate
        header[24] = (byte) (sampleRate & 0xff);
        header[25] = (byte) ((sampleRate >> 8) & 0xff);
        header[26] = (byte) ((sampleRate >> 16) & 0xff);
        header[27] = (byte) ((sampleRate >> 24) & 0xff);
        // ByteRate = SampleRate * NumChannels * BitsPerSample / 8
        long byteRate = sampleRate * numChannels * bitPerSample / 8;
        header[28] = (byte) (byteRate & 0xff);
        header[29] = (byte) ((byteRate >> 8) & 0xff);
        header[30] = (byte) ((byteRate >> 16) & 0xff);
        header[31] = (byte) ((byteRate >> 24) & 0xff);
        // BlockAlign = NumChannels * BitsPerSample / 8
        header[32] = (byte) (numChannels * bitPerSample / 8); header[33] = 0;
        // BitsPerSample
        header[34] = (byte) bitPerSample; header[35] = 0;
        // Subchunk2ID: "data"
        header[36] = 'd'; header[37] = 'a'; header[38] = 't'; header[39] = 'a';
        // Subchunk2Size: PCM data length
        header[40] = (byte) (pcmLen & 0xff);
        header[41] = (byte) ((pcmLen >> 8) & 0xff);
        header[42] = (byte) ((pcmLen >> 16) & 0xff);
        header[43] = (byte) ((pcmLen >> 24) & 0xff);
        return header;
    }

    public static void pcmToWav(File input, File output, int channelCount, int sampleRate, int bitsPerSample) throws IOException {
        final int inputSize = (int) input.length();
        try (OutputStream encoded = new FileOutputStream(output)) {
            writeToOutput(encoded, "RIFF");
            writeToOutput(encoded, 36 + inputSize);
            writeToOutput(encoded, "WAVE");
            writeToOutput(encoded, "fmt ");
            writeToOutput(encoded, 16);
            writeToOutput(encoded, (short) 1);
            writeToOutput(encoded, (short) channelCount);
            writeToOutput(encoded, sampleRate);
            writeToOutput(encoded, sampleRate * channelCount * bitsPerSample / 8);
            writeToOutput(encoded, (short) (channelCount * bitsPerSample / 8));
            writeToOutput(encoded, (short) bitsPerSample);
            writeToOutput(encoded, "data");
            writeToOutput(encoded, inputSize);
            copy(new FileInputStream(input), encoded);
        }
    }

    public static void writeToOutput(OutputStream output, String data) throws IOException {
        for (int i = 0; i < data.length(); i++) output.write(data.charAt(i));
    }

    public static void writeToOutput(OutputStream output, int data) throws IOException {
        output.write(data & 0xff);
        output.write(data >> 8 & 0xff);
        output.write(data >> 16 & 0xff);
        output.write(data >> 24 & 0xff);
    }

    public static void writeToOutput(OutputStream output, short data) throws IOException {
        output.write(data & 0xff);
        output.write(data >> 8 & 0xff);
    }

    public static long copy(InputStream source, OutputStream output) throws IOException {
        return copy(source, output, TRANSFER_BUFFER_SIZE);
    }

    public static long copy(InputStream source, OutputStream output, int bufferSize) throws IOException {
        long read = 0L;
        byte[] buffer = new byte[bufferSize];
        for (int n; (n = source.read(buffer)) != -1; read += n) {
            output.write(buffer, 0, n);
        }
        return read;
    }
}
```
AudioPlay.java
```java
package com.ho1ho.audioexample.utils;

import android.media.AudioManager;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.util.Log;

import com.ho1ho.audioexample.CustomApplication;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AudioPlay {
    private static final String TAG = AudioPlay.class.getSimpleName();

    private ExecutorService mThreadPool = Executors.newFixedThreadPool(1);
    private static int mTrackBufferSize;
    private AudioTrack mAudioTrack;
    private MediaCodec mAudioDecoder;
    private MediaExtractor mMediaExtractor;
    private boolean mIsPlaying;
    private int mSampleRate;
    private int mChannelMask;
    private int mAudioFormat;
    private byte[] mCsd0;

    public AudioPlay(int sampleRate, int channelMask, int audioFormat, byte[] csd0) {
        this(sampleRate, channelMask, audioFormat, AudioTrack.getMinBufferSize(sampleRate, channelMask, audioFormat) * 2, csd0);
    }

    public AudioPlay(int sampleRate, int channelMask, int audioFormat, int trackBufferSize, byte[] csd0) {
        Log.e(TAG, "trackBufferSize=" + trackBufferSize);
        mSampleRate = sampleRate;
        mChannelMask = channelMask;
        mAudioFormat = audioFormat;
        mTrackBufferSize = trackBufferSize;
        mCsd0 = csd0;
    }

    private void initAudioTrack() {
        mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, mSampleRate, mChannelMask, mAudioFormat, mTrackBufferSize, AudioTrack.MODE_STREAM);
        mAudioTrack.play();
    }

    private void initAudioDecoder() {
        try {
            String folder = Objects.requireNonNull(CustomApplication.instance.getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
            File mFilePath = new File(folder, "original.aac");
            mMediaExtractor = new MediaExtractor();
            mMediaExtractor.setDataSource(mFilePath.getAbsolutePath());
            MediaFormat format = mMediaExtractor.getTrackFormat(0);
            String mime = format.getString(MediaFormat.KEY_MIME);
            if (mime.startsWith("audio")) {
                mMediaExtractor.selectTrack(0);
                mAudioDecoder = MediaCodec.createDecoderByType(mime);
                format = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, mSampleRate, 1);
                // Feed the AudioSpecificConfig explicitly; a wrong csd-0 is a common
                // reason for "MediaCodec configured but produces no sound".
                ByteBuffer csd_0 = ByteBuffer.wrap(mCsd0);
                format.setByteBuffer("csd-0", csd_0);
                mAudioDecoder.configure(format, null, null, 0);
            } else {
                return;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        if (mAudioDecoder == null) {
            Log.e(TAG, "mAudioDecoder is null");
            return;
        }
        mAudioDecoder.start();
    }

    public void playPcm() {
        mIsPlaying = true;
        initAudioTrack();
        String folder = Objects.requireNonNull(CustomApplication.instance.getExternalFilesDir(null)).getAbsolutePath() + File.separator + "leo-audio";
        File pcmFile = new File(folder, "original.pcm");
        mThreadPool.execute(new Runnable() {
            @Override
            public void run() {
                byte[] pcmData = new byte[mTrackBufferSize];
                try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(pcmFile))) {
                    while (true) {
                        int readSize = is.read(pcmData, 0, pcmData.length);
                        if (readSize > 0) {
                            // Write only the bytes actually read, not the whole buffer
                            // (the original demo wrote pcmData.length every time).
                            mAudioTrack.write(pcmData, 0, readSize);
                        } else {
                            break;
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    public void playAac() {
        mIsPlaying = true;
        initAudioDecoder();
        initAudioTrack();
        mThreadPool.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    boolean isFinish = false;
                    MediaCodec.BufferInfo decodeBufferInfo = new MediaCodec.BufferInfo();
                    while (!isFinish && mIsPlaying) {
                        int inputIndex = mAudioDecoder.dequeueInputBuffer(10_000);
                        Log.w(TAG, "inputIndex=" + inputIndex);
                        if (inputIndex < 0) {
                            // No input buffer available within the timeout. Never call
                            // getInputBuffer(-1) — the original demo did, which crashes.
                            isFinish = true;
                            continue;
                        }
                        ByteBuffer inputBuffer = mAudioDecoder.getInputBuffer(inputIndex);
                        inputBuffer.clear();
                        int sampleSize = mMediaExtractor.readSampleData(inputBuffer, 0);
                        if (sampleSize > 0) {
                            byte[] sampleData = new byte[sampleSize]; // debug logging only
                            inputBuffer.get(sampleData);
                            Log.i(TAG, "Sample aac data[" + sampleData.length + "]=" + Arrays.toString(sampleData));
                            mAudioDecoder.queueInputBuffer(inputIndex, 0, sampleSize, 0, 0);
                            mMediaExtractor.advance();
                        } else {
                            mAudioDecoder.queueInputBuffer(inputIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                            isFinish = true;
                        }
                        int outputIndex = mAudioDecoder.dequeueOutputBuffer(decodeBufferInfo, 10_000);
                        Log.e(TAG, "outputIndex=" + outputIndex);
                        ByteBuffer outputBuffer;
                        byte[] chunkPCM;
                        while (outputIndex >= 0) {
                            outputBuffer = mAudioDecoder.getOutputBuffer(outputIndex);
                            chunkPCM = new byte[decodeBufferInfo.size];
                            outputBuffer.get(chunkPCM);
                            outputBuffer.clear();
                            if (chunkPCM.length > 0) {
                                Log.i(TAG, "PCM data[" + chunkPCM.length + "]=" + Arrays.toString(chunkPCM));
                                mAudioTrack.write(chunkPCM, 0, decodeBufferInfo.size);
                            }
                            mAudioDecoder.releaseOutputBuffer(outputIndex, false);
                            outputIndex = mAudioDecoder.dequeueOutputBuffer(decodeBufferInfo, 10_000);
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    stop();
                }
            }
        });
    }

    public void stop() {
        mIsPlaying = false;
        if (mAudioDecoder != null) {
            try {
                mAudioDecoder.stop();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        if (mAudioTrack != null) {
            try {
                mAudioTrack.stop();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void release() {
        stop();
        if (mAudioDecoder != null) {
            try {
                mAudioDecoder.release();
            } catch (Exception e) {
                e.printStackTrace();
            }
            mAudioDecoder = null;
        }
        if (mAudioTrack != null) {
            try {
                mAudioTrack.release();
            } catch (Exception e) {
                e.printStackTrace();
            }
            mAudioTrack = null;
        }
        if (mMediaExtractor != null) {
            mMediaExtractor.release();
            mMediaExtractor = null;
        }
        if (mThreadPool != null) {
            mThreadPool.shutdownNow();
            mThreadPool = null;
        }
    }

    public boolean isPlaying() {
        return mIsPlaying;
    }
}
```
ByteUtil.java
```java
package com.ho1ho.audioexample.utils.others;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;

public class ByteUtil {
    private static ByteBuffer buffer = ByteBuffer.allocate(8);

    public static byte int2Byte(int x) {
        return (byte) x;
    }

    public static byte[] byte2Bytes(byte b) {
        return new byte[]{b};
    }

    public static byte[] intAsByteAndForceToBytes(int val) {
        return byte2Bytes(int2Byte(val));
    }

    public static int byte2Int(byte b) {
        return b & 0xFF;
    }

    public static int bytes2Int(byte[] b) {
        return b[3] & 0xFF | (b[2] & 0xFF) << 8 | (b[1] & 0xFF) << 16 | (b[0] & 0xFF) << 24;
    }

    public static int bytes2IntLE(byte[] b) {
        return (b[3] & 0xFF) << 24 | (b[2] & 0xFF) << 16 | (b[1] & 0xFF) << 8 | (b[0] & 0xFF);
    }

    public static int bytes2Int(byte[] b, int index) {
        return b[index + 3] & 0xFF | (b[index + 2] & 0xFF) << 8 | (b[index + 1] & 0xFF) << 16 | (b[index] & 0xFF) << 24;
    }

    public static int bytes2IntLE(byte[] b, int index) {
        return b[index] & 0xFF | (b[index + 1] & 0xFF) << 8 | (b[index + 2] & 0xFF) << 16 | (b[index + 3] & 0xFF) << 24;
    }

    public static byte[] int2Bytes(int a) {
        return new byte[]{
                (byte) ((a >> 24) & 0xFF),
                (byte) ((a >> 16) & 0xFF),
                (byte) ((a >> 8) & 0xFF),
                (byte) (a & 0xFF)
        };
    }

    public static byte[] intLE2Bytes(int a) {
        return new byte[]{
                (byte) (a & 0xFF),
                (byte) ((a >> 8) & 0xFF),
                (byte) ((a >> 16) & 0xFF),
                (byte) ((a >> 24) & 0xFF)
        };
    }

    // Big-endian write: high byte first. (The original demo had the BE/LE
    // short writers swapped relative to the readers below.)
    public static void bytes2Short(byte[] b, short s, int index) {
        b[index] = (byte) (s >> 8);
        b[index + 1] = (byte) s;
    }

    // Little-endian write: low byte first.
    public static void bytes2ShortLE(byte[] b, short s, int index) {
        b[index] = (byte) s;
        b[index + 1] = (byte) (s >> 8);
    }

    public static short bytes2Short(byte[] b, int index) {
        return (short) ((b[index] << 8) | b[index + 1] & 0xff);
    }

    public static short bytes2ShortLE(byte[] b, int index) {
        return (short) ((b[index + 1] << 8) | b[index] & 0xff);
    }

    public static byte[] short2Bytes(short s) {
        byte[] targets = new byte[2];
        for (int i = 0; i < 2; i++) {
            int offset = (targets.length - 1 - i) * 8;
            targets[i] = (byte) ((s >>> offset) & 0xff);
        }
        return targets;
    }

    public static byte[] shortLE2Bytes(short s) {
        byte[] targets = new byte[2];
        for (int i = 0; i < 2; i++) {
            targets[i] = (byte) ((s >>> i * 8) & 0xff);
        }
        return targets;
    }

    public static short bytes2Short(byte[] b) {
        return bytes2Short(b, 0);
    }

    public static short bytes2ShortLE(byte[] b) {
        return bytes2ShortLE(b, 0);
    }

    public static byte[] long2Bytes(long x) {
        buffer.putLong(0, x);
        return buffer.array();
    }

    public static long bytes2Long(byte[] bytes) {
        buffer.clear();
        buffer.put(bytes, 0, bytes.length);
        buffer.flip();
        return buffer.getLong();
    }

    public static byte[] getBytes(byte[] data, int start, int end) {
        byte[] ret = new byte[end - start];
        for (int i = 0; (start + i) < end; i++) {
            ret[i] = data[start + i];
        }
        return ret;
    }

    public static byte[] readInputStream(InputStream inStream) {
        ByteArrayOutputStream outStream = null;
        try {
            outStream = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len;
            while ((len = inStream.read(buffer)) != -1) {
                outStream.write(buffer, 0, len);
            }
            return outStream.toByteArray();
        } catch (IOException e) {
            return null;
        } finally {
            try {
                if (outStream != null) {
                    outStream.close();
                }
                inStream.close();
            } catch (IOException ignored) {
                // Don't return from finally; that would swallow the real result.
            }
        }
    }

    public static InputStream readByteArr(byte[] b) {
        return new ByteArrayInputStream(b);
    }

    public static boolean isEqual(byte[] s1, byte[] s2) {
        int slen = s1.length;
        if (slen == s2.length) {
            for (int index = 0; index < slen; index++) {
                if (s1[index] != s2[index]) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    public static String getString(byte[] s1, String encode, String err) {
        try {
            return new String(s1, encode);
        } catch (UnsupportedEncodingException e) {
            return err;
        }
    }

    public static String getString(byte[] s1, String encode) {
        return getString(s1, encode, null);
    }

    public static String bytes2HexString(byte[] b) {
        StringBuilder result = new StringBuilder();
        for (byte value : b) {
            result.append(Integer.toString((value & 0xff) + 0x100, 16).substring(1));
        }
        return result.toString();
    }

    public static int hexString2Int(String hexString) {
        return Integer.parseInt(hexString, 16);
    }

    public static String int2Binary(int i) {
        return Integer.toBinaryString(i);
    }

    public static byte[] mergeByte(byte... bs) {
        byte[] result = new byte[bs.length];
        System.arraycopy(bs, 0, result, 0, result.length);
        return result;
    }

    public static byte[] mergeBytes(byte[]... byteList) {
        int lengthByte = 0;
        for (byte[] bytes : byteList) {
            lengthByte += bytes.length;
        }
        byte[] allBytes = new byte[lengthByte];
        int countLength = 0;
        for (byte[] b : byteList) {
            System.arraycopy(b, 0, allBytes, countLength, b.length);
            countLength += b.length;
        }
        return allBytes;
    }

    public static void main(String[] args) {
        System.err.println(isEqual(new byte[]{1, 2}, new byte[]{1, 2}));
        System.err.println(bytes2HexString(new byte[]{0, 0, 1, 0}));
        System.err.println(bytes2Int(new byte[]{0, 0, 1, 0}));
        System.err.println(bytes2IntLE(new byte[]{0, 0, 1, 0}));
    }
}
```
AndroidManifest.xml
```xml
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    package="com.ho1ho.audioexample">

    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

    <application
        android:name=".CustomApplication"
        android:allowBackup="false"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme"
        tools:ignore="GoogleAppIndexingWarning">
        <activity android:name=".MainActivity">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>
```
About CSD (Codec Specific Data)

Computing the CSD from the AAC profile, sample rate, and channel count

The CSD (Codec Specific Data) occupies two bytes, i.e. 16 bits in total. Computing it requires three parameters: the AAC profile (5 bits), the sampling-frequency index (4 bits) and the channel count (4 bits), plus 3 reserved bits whose value is 0.

For example, take AAC-LC, 44.1 kHz, mono. The corresponding decimal values are 2, 4 and 1 (see the comments in AacEncoder.java above).
In binary these are: 10, 100, 1.
Padded to their field widths: 00010, 0100, 0001, 000.
Regrouped into 4-bit nibbles: 0001, 0010, 0000, 1000.
Converted to hex: 0x1, 0x2, 0x0, 0x8.
So the CSD value is 0x12, 0x08.
```java
public byte[] getAudioEncodingCsd0(int aacProfile, int sampleRate, int channelCount) {
    int freqIdx = getSampleFrequencyIndex(sampleRate);
    ByteBuffer csd = ByteBuffer.allocate(2);
    csd.put(0, (byte) (aacProfile << 3 | freqIdx >> 1));
    csd.put(1, (byte) ((freqIdx & 0x01) << 7 | channelCount << 3));
    byte[] csd0 = new byte[2];
    csd.get(csd0);
    csd.clear();
    return csd0;
}
```
```java
public int getSampleFrequencyIndex(int sampleRate) {
    switch (sampleRate) {
        case 7350:  return 12;
        case 8000:  return 11;
        case 11025: return 10;
        case 12000: return 9;
        case 16000: return 8;
        case 22050: return 7;
        case 24000: return 6;
        case 32000: return 5;
        case 44100: return 4;
        case 48000: return 3;
        case 64000: return 2;
        case 88200: return 1;
        case 96000: return 0;
        default:    return -1;
    }
}
```
Deriving the AAC profile, sampling-frequency index, and channel count from the CSD

```java
int profile = (csd[0] >> 3) & 0x1F;
int freqIdx = ((csd[0] & 0x7) << 1) | ((csd[1] >> 7) & 0x1);
int chanCfg = (csd[1] >> 3) & 0xF;
```
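The encode and decode directions above can be checked against each other in plain Java, without Android. The sketch below repeats both bit layouts and round-trips the parameters used by this demo (AAC-LC = profile 2, 16 kHz = frequency index 8, mono), which should reproduce the `AUDIO_CSD_0 = {0x14, 0x08}` constant from MainActivity:

```java
// Round-trip check of the csd-0 bit math above (no Android dependencies).
public class CsdCheck {

    // 5 bits profile | 4 bits frequency index | 4 bits channel config | 3 reserved bits
    public static byte[] encodeCsd0(int aacProfile, int freqIdx, int channelCount) {
        byte[] csd = new byte[2];
        csd[0] = (byte) (aacProfile << 3 | freqIdx >> 1);
        csd[1] = (byte) ((freqIdx & 0x01) << 7 | channelCount << 3);
        return csd;
    }

    // Inverse of encodeCsd0: returns {profile, freqIdx, chanCfg}.
    public static int[] decodeCsd0(byte[] csd) {
        int profile = (csd[0] >> 3) & 0x1F;
        int freqIdx = ((csd[0] & 0x7) << 1) | ((csd[1] >> 7) & 0x1);
        int chanCfg = (csd[1] >> 3) & 0xF;
        return new int[]{profile, freqIdx, chanCfg};
    }

    public static void main(String[] args) {
        byte[] csd = encodeCsd0(2, 8, 1); // AAC-LC, 16 kHz (index 8), mono
        System.out.printf("csd-0 = 0x%02X, 0x%02X%n", csd[0], csd[1]); // 0x14, 0x08
        int[] back = decodeCsd0(csd);
        System.out.println(back[0] + " " + back[1] + " " + back[2]);   // 2 8 1
    }
}
```

Doing this round trip once on your target configuration is a cheap way to rule out a bad `csd-0` before blaming MediaCodec.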
Other notes

About the demo code above: this is an early demo I wrote and it is not optimized; recording and encoding run on the same thread. That is fine for single-device testing, but in real-time two-device calls it introduces roughly one second of latency. How that latency arises is explained in detail below.

Note: in the actual project I have since optimized and refactored this code, moving recording, encoding and sending onto separate threads. For various reasons I cannot post the revised code here; apologies for that, but the optimization is straightforward to apply yourself.
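Since I cannot post the refactored code, here is my own minimal sketch of the "separate threads for record / encode / send" idea, using bounded queues to decouple the stages. The stage work is simulated (fixed-size fake frames); in a real app you would plug `AudioRecord.read()`, the MediaCodec AAC encoder, and the TCP client into the marked spots:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Three-stage pipeline: recorder -> encoder -> sender, each on its own thread.
// Bounded queues provide back-pressure so a slow stage cannot pile up memory.
public class AudioPipeline {
    private static final byte[] EOS = new byte[0]; // end-of-stream sentinel

    public static int run(int frames) {
        BlockingQueue<byte[]> pcmQueue = new ArrayBlockingQueue<>(16);
        BlockingQueue<byte[]> aacQueue = new ArrayBlockingQueue<>(16);
        AtomicInteger sent = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);

        Thread recorder = new Thread(() -> {
            try {
                for (int i = 0; i < frames; i++) {
                    pcmQueue.put(new byte[1280]); // AudioRecord.read() would fill this
                }
                pcmQueue.put(EOS);
            } catch (InterruptedException ignored) { }
        });

        Thread encoder = new Thread(() -> {
            try {
                for (byte[] pcm; (pcm = pcmQueue.take()) != EOS; ) {
                    aacQueue.put(new byte[pcm.length / 16]); // MediaCodec AAC encode goes here
                }
                aacQueue.put(EOS);
            } catch (InterruptedException ignored) { }
        });

        Thread sender = new Thread(() -> {
            try {
                for (byte[] aac; (aac = aacQueue.take()) != EOS; ) {
                    sent.incrementAndGet(); // client.sendData(aac) goes here
                }
                done.countDown();
            } catch (InterruptedException ignored) { }
        });

        recorder.start();
        encoder.start();
        sender.start();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent.get();
    }

    public static void main(String[] args) {
        System.out.println(run(18)); // 18 simulated frames traverse all three stages
    }
}
```

The point of the structure is that a 40 ms `AudioRecord.read()` never waits for the encoder, and the encoder never waits for the network; each stage only blocks when its downstream queue is genuinely full.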
Latency in real-time voice transmission

When I first tested two-way real-time voice in the actual project, there was a delay of over one second. Investigation revealed the cause.

Take a WeChat-style real-time voice configuration: sample rate 8000 Hz, stereo, 16-bit samples. When fetching PCM data via AudioRecord.read(), each read returns 1280 bytes. The first read takes by far the longest, about 250 ms; every subsequent read takes about 40 ms (measured on a Samsung Galaxy S7 edge, G9350). Encoding to AAC with this configuration — sample rate 8000, bitrate 16000, stereo, AAC profile AAC-LC, AudioRecord.getMinBufferSize of 1280 bytes — it takes roughly three PCM reads to produce one AAC frame, each around 260 bytes. In other words, an AAC frame is produced only about every 120 ms.

When that AAC data is streamed to another phone to be decoded back to PCM and played (phone A sends each AAC frame to phone B as soon as it is encoded), sound only comes out after about 6 AAC frames have been received. That means the PCM from at least 18 recording reads must be encoded before playback can start, and 18 reads take roughly 250 + 17 × 40 = 930 ms. Adding the encode/decode time on both sides (MediaCodec encoding and decoding of PCM is fast and is not a noticeable cost) and network transit (tested on a LAN with excellent throughput), the measured delay was about 1.2 seconds.

So the picture is clear: the delay comes from accumulating too many recording reads. As another experiment, I sent raw PCM without AAC encoding; the measured delay immediately dropped to around 500 ms. The trade-off is a much larger payload per send, since without encoding there is no compression (PCM was about 16× the size of the AAC in the earlier setup). The next problem to solve is how to compress the PCM some other way.
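The numbers above follow directly from the audio parameters. A quick sketch of the arithmetic (the 250 ms first-read cost is the measured device-specific value from the text, not something derivable):

```java
// The arithmetic behind the latency figures: 8 kHz, stereo, 16-bit PCM, AAC @ 16 kbps.
public class LatencyMath {

    // Raw PCM throughput in bytes per second.
    public static int pcmBytesPerSecond(int sampleRate, int channels, int bytesPerSample) {
        return sampleRate * channels * bytesPerSample;
    }

    // How many milliseconds of audio one AudioRecord.read() of this size covers.
    public static int readDurationMs(int readSizeBytes, int bytesPerSecond) {
        return readSizeBytes * 1000 / bytesPerSecond;
    }

    public static void main(String[] args) {
        int bps = pcmBytesPerSecond(8000, 2, 2);   // 32000 B/s of raw PCM
        int readMs = readDurationMs(1280, bps);    // each 1280-byte read = 40 ms of audio
        int totalMs = 250 + 17 * readMs;           // measured first read + 17 more reads
        double ratio = bps / (16000 / 8.0);        // PCM byte rate vs. 16 kbps AAC byte rate

        System.out.println(readMs);  // 40
        System.out.println(totalMs); // 930
        System.out.println(ratio);   // 16.0
    }
}
```

This also explains why sending raw PCM helped: it removes the need to accumulate three reads per AAC frame and six frames before playback, at the cost of 16× the bandwidth.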
A rant

I have to call out certain people here. While researching this topic, I kept seeing comments under other people's articles like "why doesn't the code compile after I copy it?" — just because the author didn't paste a logging class or some other non-business helper. Everyone wraps utilities like logging in their own way; if you want the code to compile, make that trivial change yourself. The articles in question already spelled the code out in detail: copy it into a project, tweak the logging and other basics, and it works. Yet people demand the entire project. And even if you were handed the complete project, you would probably still ask why it doesn't build when you don't have the matching Gradle or SDK version.

I really cannot stand this attitude. If you are too lazy to copy code, or genuinely don't know what to change, then study basic Android development before asking.

When we read other people's articles, we should be looking for inspiration — figuring out what we missed and what to do next — not demanding a ready-made solution. The author is not your teacher or your parent and has no obligation to hand you everything; they teach you the core knowledge, and extrapolating from it is up to you. It's like an exam: not every question will be one you've seen before, and if you fail on an unfamiliar problem type, would you really tell the teacher it's their fault for not covering that exact question?