This repository has been archived by the owner on Aug 3, 2021. It is now read-only.

demo_streaming_asr.py, AssertionError #532

Open
HunbeomBak opened this issue Apr 1, 2020 · 0 comments
HunbeomBak commented Apr 1, 2020

Please excuse my English.

I tested the microphone demo.

The model was trained on my own dataset, and I added the interactive_infer_params section to the config file:
interactive_infer_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "num_audio_features": 64,
        "input_type": "logfbank",
        "vocab_file": "open_seq2seq/test_utils/toy_speech_data/vocab.txt",
        "dataset_files": [],
        "shuffle": False,
    },
}

OpenSeq2Seq was installed via Docker, and training and inference ran without problems.
I used jasper-Mini-for-Jetson.py as the config file for both training and inference.

Below is the output of the run.

root@f31b402db666:/data/ASR/OpenSeq2Seq# python demo_streaming_asr.py
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_route.c:867:(find_matching_chmap) Found no matching channel map
Available audio input devices:
0 HDA Intel PCH: ALC1220 Analog (hw:0,0)
2 HDA Intel PCH: ALC1220 Alt Analog (hw:0,2)
11 Microsoft® LifeCam HD-3000: USB Audio (hw:3,0)
12 sysdefault
22 default
Please type input device ID:
11

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

*** Restoring from the latest checkpoint
*** Inference config:
{'batch_size_per_gpu': 1,
'data_layer': <class 'open_seq2seq.data.speech2text.speech2text.Speech2TextDataLayer'>,
'data_layer_params': {'backend': 'librosa',
'dataset_files': [],
'dither': 1e-05,
'input_type': 'logfbank',
'norm_per_feature': True,
'num_audio_features': 64,
'pad_to': 0,
'precompute_mel_basis': True,
'sample_freq': 16000,
'shuffle': False,
'vocab_file': 'open_seq2seq/test_utils/toy_speech_data/vocab.txt',
'window': 'hanning'},
'decoder': <class 'open_seq2seq.decoders.fc_decoders.FullyConnectedCTCDecoder'>,
'decoder_params': {'infer_logits_to_pickle': True,
'initializer': <function xavier_initializer at 0x7f926707ea60>,
'use_language_model': False},
'dtype': tf.float32,
'encoder': <class 'open_seq2seq.encoders.tdnn_encoder.TDNNEncoder'>,
'encoder_params': {'activation_fn': <function relu at 0x7f913e994730>,
'convnet_layers': [{'dilation': [1],
'kernel_size': [11],
'num_channels': 256,
'padding': 'SAME',
'repeat': 1,
'stride': [2],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [11],
'num_channels': 256,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [11],
'num_channels': 256,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [13],
'num_channels': 256,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [13],
'num_channels': 256,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [17],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [17],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [21],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [21],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [25],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [25],
'num_channels': 512,
'padding': 'SAME',
'repeat': 3,
'residual': True,
'residual_dense': False,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [2],
'kernel_size': [29],
'num_channels': 512,
'padding': 'SAME',
'repeat': 1,
'stride': [1],
'type': 'sep_conv1d'},
{'dilation': [1],
'kernel_size': [1],
'num_channels': 1024,
'padding': 'SAME',
'repeat': 1,
'stride': [1],
'type': 'sep_conv1d'}],
'data_format': 'channels_last',
'dropout_keep_prob': 1.0,
'initializer': <function xavier_initializer at 0x7f926707ea60>,
'initializer_params': {'uniform': False},
'normalization': 'batch_norm',
'use_conv_mask': True},
'eval_steps': 2200,
'iter_size': 1,
'larc_params': {'larc_eta': 0.001},
'logdir': '/data2/model/20200330_LDC_ATCOSIM_mini',
'loss': <class 'open_seq2seq.losses.ctc_loss.CTCLoss'>,
'loss_params': {},
'lr_policy': <function poly_decay at 0x7f9254fa3d90>,
'lr_policy_params': {'learning_rate': 0.02, 'min_lr': 1e-05, 'power': 2.0},
'num_checkpoints': 2,
'num_epochs': 100,
'num_gpus': 1,
'optimizer': <class 'open_seq2seq.optimizers.novograd.NovoGrad'>,
'optimizer_params': {'beta1': 0.95,
'beta2': 0.98,
'epsilon': 1e-08,
'grad_averaging': False,
'weight_decay': 0.001},
'print_loss_steps': 100,
'print_samples_steps': 2200,
'random_seed': 0,
'save_checkpoint_steps': 1100,
'save_summaries_steps': 100,
'summaries': ['learning_rate',
'variables',
'gradients',
'larc_summaries',
'variable_norm',
'gradient_norm',
'global_gradient_norm'],
'use_horovod': False,
'use_xla_jit': False}
*** Building graph on GPU:0
WARNING:tensorflow:From /data/ASR/OpenSeq2Seq/open_seq2seq/parts/cnns/conv_blocks.py:192: separable_conv1d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.separable_conv1d instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /data/ASR/OpenSeq2Seq/open_seq2seq/parts/cnns/conv_blocks.py:223: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.batch_normalization instead.
WARNING:tensorflow:From /data/ASR/OpenSeq2Seq/open_seq2seq/encoders/tdnn_encoder.py:255: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use rate instead of keep_prob. Rate should be set to rate = 1 - keep_prob.
WARNING:tensorflow:From /data/ASR/OpenSeq2Seq/open_seq2seq/decoders/fc_decoders.py:139: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
*** Inference Mode. Loss part of graph isn't built.
2020-04-01 05:06:12.246228: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3300000000 Hz
2020-04-01 05:06:12.248433: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x7ddb280 executing computations on platform Host. Devices:
2020-04-01 05:06:12.248487: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): <undefined>, <undefined>
2020-04-01 05:06:12.455382: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x7de1000 executing computations on platform CUDA. Devices:
2020-04-01 05:06:12.455442: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2020-04-01 05:06:12.455462: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (1): TITAN Xp, Compute Capability 6.1
2020-04-01 05:06:12.456070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:67:00.0
totalMemory: 11.91GiB freeMemory: 191.50MiB
2020-04-01 05:06:12.456185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:68:00.0
totalMemory: 11.91GiB freeMemory: 173.38MiB
2020-04-01 05:06:12.456579: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2020-04-01 05:06:13.345461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-01 05:06:13.345502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2020-04-01 05:06:13.345508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
2020-04-01 05:06:13.345513: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
2020-04-01 05:06:13.349140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 121 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:67:00.0, compute capability: 6.1)
2020-04-01 05:06:13.349769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 103 MB memory) -> physical GPU (device: 1, name: TITAN Xp, pci bus id: 0000:68:00.0, compute capability: 6.1)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/decorator_utils.py:145: GraphKeys.VARIABLES (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.GraphKeys.GLOBAL_VARIABLES instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Initialization was successful
Traceback (most recent call last):
File "demo_streaming_asr.py", line 28, in callback
pred = asr.transcribe(signal)
File "/data/ASR/OpenSeq2Seq/frame_asr.py", line 237, in transcribe
return self._decode(frame, self.offset, self.merge)
File "/data/ASR/OpenSeq2Seq/frame_asr.py", line 195, in _decode
assert len(frame)==self.n_frame_len
AssertionError
Traceback (most recent call last):
File "demo_streaming_asr.py", line 45, in <module>
time.sleep(0.1)
AssertionError
root@f31b402db666:/data/ASR/OpenSeq2Seq#

I tried to find the cause of the problem.

In frame_asr.py, I added two print statements to transcribe:

```python
def transcribe(self, frame=None):
    # Debug output: compare the incoming frame with the expected frame length
    print(np.shape(frame))
    print(self.n_frame_len)

    if frame is None:
        frame = np.zeros(shape=self.n_frame_len, dtype=np.float32)
    if len(frame) < self.n_frame_len:
        frame = np.pad(frame, [0, self.n_frame_len - len(frame)], 'constant')
    return self._decode(frame, self.offset, self.merge)
```

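Result:
(32000,)
3200

It looks like a shape mismatch: the frame passed to transcribe has 32000 samples, but self.n_frame_len is 3200. transcribe only pads frames that are too short, so a longer frame reaches _decode unchanged and fails the assertion.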
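To put the two numbers in context, here is a small sketch that converts them to durations. It assumes the microphone stream is really delivered at the 16000 Hz shown as `sample_freq` in the inference config, which I have not verified; `samples_to_seconds` is a helper name I made up for this illustration.

```python
SAMPLE_FREQ = 16000  # 'sample_freq' from the printed inference config (assumed to match the mic stream)


def samples_to_seconds(n_samples, sample_freq=SAMPLE_FREQ):
    """Convert a sample count to a duration in seconds."""
    return n_samples / sample_freq


received = 32000  # len(frame) printed by the debug statement
expected = 3200   # self.n_frame_len printed by the debug statement

print(samples_to_seconds(received))  # 2.0
print(samples_to_seconds(expected))  # 0.2
print(received // expected)          # 10
```

Under that assumption, the callback receives 2.0 seconds of audio per call while the model expects 0.2 seconds, a factor of 10.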

How can I solve this problem?
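One workaround I am considering is to cut the oversized buffer into pieces of n_frame_len before decoding. The following is only a sketch under that assumption, not the demo's intended fix: `split_into_frames` is a hypothetical helper, and I do not know whether the demo instead expects the audio stream's buffer size to match n_frame_len.

```python
import numpy as np


def split_into_frames(signal, n_frame_len):
    """Yield consecutive float32 frames of exactly n_frame_len samples.

    The last frame is zero-padded if the signal length is not a
    multiple of n_frame_len (same padding as in transcribe).
    """
    signal = np.asarray(signal, dtype=np.float32)
    for start in range(0, len(signal), n_frame_len):
        frame = signal[start:start + n_frame_len]
        if len(frame) < n_frame_len:
            frame = np.pad(frame, [0, n_frame_len - len(frame)], 'constant')
        yield frame


# A 32000-sample buffer split for n_frame_len = 3200
frames = list(split_into_frames(np.zeros(32000), 3200))
print(len(frames), frames[0].shape)
```

In the callback, each frame could then be passed to asr.transcribe(frame) in turn, so that the assertion on len(frame) in _decode holds.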

@HunbeomBak HunbeomBak changed the title AssertionError demo_streaming_asr.py, AssertionError Apr 1, 2020