Updated: 2024/05/14


LU Huimin
陸 慧敏 (Japanese reading: リク ケイビン)

Scopus publication information
Total papers: 0   Total citations: 0   h-index: 46

The citation count is the number of citations received by papers published in the corresponding year.

Affiliation
Faculty of Engineering, Department of Mechanical and Control Engineering (大学院工学研究院 機械知能工学研究系)
Position
Associate Professor
Email address
Homepage
External link

Research keywords

  • Robotics

  • Underwater optics

  • Computer vision

  • Artificial intelligence

Research fields

  • Manufacturing technology (mechanical, electrical/electronic, chemical engineering) / Measurement engineering

  • Information and communication / Perceptual information processing

Degree

  • Kyushu Institute of Technology  -  Doctor of Engineering   March 2014

Employment history at the university

  • September 2019 - present   Kyushu Institute of Technology   Faculty of Engineering   Department of Mechanical and Control Engineering   Associate Professor

Academic societies and committees

  • April 2020 - present   The Institute of Electronics, Information and Communication Engineers (IEICE)   Japan

  • August 2019 - present   SPIE   United States

  • January 2012 - present   IEEE   United States

Papers

  • Multiscale Shared Learning for Fault Diagnosis of Rotating Machinery in Transportation Infrastructures (peer-reviewed)

    Chen Z., Tian S., Shi X., Lu H.

    IEEE Transactions on Industrial Informatics   19 ( 1 )   447 - 458   January 2023

    Language: English   Publication type: Research paper (academic journal)

    Rotating machinery is ubiquitous, and its failures constitute a major cause of the failures of transportation infrastructures. Most fault-diagnosis methods for rotating machinery are based on vibration-signal analysis because vibrations directly reflect the transient regime of machinery elements. This article proposes a novel multiscale shared-learning network (MSSLN) architecture to extract and classify the fault features inherent to multiscale factors of vibration signals. The architecture fuses layer-wise activations with multiscale flows to enable the network to fully learn a shared representation that is consistent across multiscale factors. This characteristic helps MSSLN provide more faithful diagnoses than existing single- and multiscale methods. Experiments on bearing and gearbox datasets are used to evaluate fault-diagnosis performance for transportation infrastructures. Extensive experimental results and comprehensive analyses demonstrate the superiority of the proposed MSSLN in fault diagnosis for bearings and gearboxes, the two foundational elements in transportation infrastructures.

    DOI: 10.1109/TII.2022.3148289

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85127486667&origin=inward
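
    The following is a minimal, illustrative sketch (not the authors' MSSLN) of the general idea behind multiscale feature learning on vibration signals: parallel 1-D convolutions with different kernel sizes produce per-scale activations that are fused into one shared representation. All layer sizes and the two-class output head are assumptions made for this sketch.

    # Illustrative sketch only; sizes and the fusion rule are placeholders, not the paper's configuration.
    import torch
    import torch.nn as nn

    class MultiScaleExtractor(nn.Module):
        def __init__(self, scales=(3, 7, 15), channels=16, num_classes=2):
            super().__init__()
            # one branch per kernel scale; padding keeps the temporal length unchanged
            self.branches = nn.ModuleList(
                nn.Sequential(
                    nn.Conv1d(1, channels, kernel_size=k, padding=k // 2),
                    nn.BatchNorm1d(channels),
                    nn.ReLU(),
                )
                for k in scales
            )
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x):                      # x: (batch, 1, signal_length)
            feats = [b(x) for b in self.branches]  # same shape for every branch
            shared = torch.stack(feats).mean(0)    # fuse the multiscale activations
            pooled = shared.mean(dim=-1)           # global average pooling over time
            return self.head(pooled)

    logits = MultiScaleExtractor()(torch.randn(4, 1, 1024))  # -> (4, 2)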

  • A Parkinson's Auxiliary Diagnosis Algorithm Based on a Hyperparameter Optimization Method of Deep Learning (peer-reviewed)

    Wang X., Li S., Pun C.M., Guo Y., Xu F., Gao H., Lu H.

    IEEE/ACM Transactions on Computational Biology and Bioinformatics   January 2023

    Language: English   Publication type: Research paper (academic journal)

    Parkinson's disease is a common mental disease in the world, especially in middle-aged and elderly groups. Today, clinical diagnosis is the main diagnostic method for Parkinson's disease, but the diagnosis results are not ideal, especially in the early stage of the disease. In this paper, a Parkinson's auxiliary diagnosis algorithm based on a hyperparameter optimization method of deep learning is proposed. The diagnosis system uses ResNet50 to achieve feature extraction and Parkinson's classification, and mainly consists of a speech signal processing part, an algorithm improvement part based on the Artificial Bee Colony algorithm (ABC), and a part that optimizes the hyperparameters of ResNet50. The improved algorithm, called the Gbest Dimension Artificial Bee Colony algorithm (GDABC), introduces a "range pruning strategy", which narrows the scope of the search, and a "dimension adjustment strategy", which adjusts gbest dimension by dimension. The accuracy of the diagnosis system on the validation set of the Mobile Device Voice Recordings at King's College London (MDVR-CKL) dataset reaches more than 96%. Compared with current Parkinson's sound diagnosis methods and other optimization algorithms, our auxiliary diagnosis system shows better classification performance on the dataset within limited time and resources.

    DOI: 10.1109/TCBB.2023.3246961

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85149365581&origin=inward

  • JDSR-GAN: Constructing An Efficient Joint Learning Network for Masked Face Super-Resolution (peer-reviewed)

    Gao G., Tang L., Wu F., Lu H., Yang J.

    IEEE Transactions on Multimedia   January 2023

    Language: English   Publication type: Research paper (academic journal)

    With the growing importance of preventing the COVID-19 virus in cyber-manufacturing security, face images obtained in most video surveillance scenarios are usually low resolution together with mask occlusion. However, most of the previous face super-resolution solutions can not efficiently handle both tasks in one model. In this work, we consider both tasks simultaneously and construct an efficient joint learning network, called JDSR-GAN, for masked face super-resolution tasks. Given a low-quality face image with mask as input, the role of the generator composed of a denoising module and super-resolution module is to acquire a high-quality high-resolution face image. The discriminator utilizes some carefully designed loss functions to ensure the quality of the recovered face images. Moreover, we incorporate the identity information and attention mechanism into our network for feasible correlated feature expression and informative feature learning. By jointly performing denoising and face super-resolution, the two tasks can complement each other and attain promising performance. Extensive qualitative and quantitative results show the superiority of our proposed JDSR-GAN over some competitive methods.

    DOI: 10.1109/TMM.2023.3240880

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85148474882&origin=inward

  • Lightweight Real-Time Semantic Segmentation Network With Efficient Transformer and CNN (peer-reviewed)

    Xu G., Li J., Gao G., Lu H., Yang J., Yue D.

    IEEE Transactions on Intelligent Transportation Systems   January 2023

    Language: English   Publication type: Research paper (academic journal)

    In the past decade, convolutional neural networks (CNNs) have shown prominence for semantic segmentation. Although CNN models achieve very impressive performance, their ability to capture global representations is still insufficient, which results in suboptimal results. Recently, the Transformer achieved huge success in NLP tasks, demonstrating its advantages in modeling long-range dependency. The Transformer has also attracted tremendous attention from computer vision researchers, who reformulate image processing tasks as sequence-to-sequence prediction, although this comes at the cost of deteriorated local feature details. In this work, we propose a lightweight real-time semantic segmentation network called LETNet. LETNet combines a U-shaped CNN with the Transformer effectively in a capsule embedding style to compensate for their respective deficiencies. Meanwhile, the elaborately designed Lightweight Dilated Bottleneck (LDB) module and Feature Enhancement (FE) module cultivate a positive impact on training from scratch. Extensive experiments performed on challenging datasets demonstrate that LETNet achieves a superior balance of accuracy and efficiency. Specifically, it only contains 0.95M parameters and 13.6G FLOPs but yields 72.8% mIoU at 120 FPS on the Cityscapes test set and 70.5% mIoU at 250 FPS on the CamVid test dataset using a single RTX 3090 GPU. Source code will be available at https://github.com/IVIPLab/LETNet.

    DOI: 10.1109/TITS.2023.3248089

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85149404169&origin=inward

  • Multi-receptive field spatiotemporal network for action recognition (peer-reviewed)

    Nie M., Yang S., Wang Z., Zhang B., Lu H., Yang W.

    International Journal of Machine Learning and Cybernetics   January 2023

    Language: English   Publication type: Research paper (academic journal)

    Despite the great progress in action recognition made by deep neural networks, visual tempo may be overlooked in the feature learning process of existing methods. The visual tempo is the dynamic and temporal scale variation of actions. Existing models usually understand spatiotemporal scenes using temporal and spatial convolutions, which are limited in both temporal and spatial dimensions, and they cannot cope with differences in visual tempo changes. To address these issues, we propose a multi-receptive field spatiotemporal (MRF-ST) network to effectively model the spatial and temporal information of different receptive fields. In the proposed network, dilated convolution is utilized to obtain different receptive fields. Meanwhile, dynamic weighting for different dilation rates is designed based on the attention mechanism. Thus, the proposed MRF-ST network can directly capture various tempos in the same network layer without any additional cost. Moreover, the network can improve the accuracy of action recognition by learning more visual tempos of different actions. Extensive evaluations show that MRF-ST reaches the state-of-the-art on three popular benchmarks for action recognition: UCF-101, HMDB-51, and Diving-48. Further analysis also indicates that MRF-ST can significantly improve the performance in scenes with large variance in visual tempo.

    DOI: 10.1007/s13042-023-01774-0

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146388497&origin=inward
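
    A minimal sketch of the multi-receptive-field idea described in the abstract above: parallel dilated convolutions provide different receptive fields, and attention weights predicted from the input decide how strongly each dilation rate contributes. This is illustrative only; the actual MRF-ST block, its dilation rates, and its spatiotemporal layout may differ.

    # Illustrative sketch only; channel width and dilation rates are placeholders.
    import torch
    import torch.nn as nn

    class MultiReceptiveFieldBlock(nn.Module):
        def __init__(self, channels=32, dilations=(1, 2, 4)):
            super().__init__()
            self.paths = nn.ModuleList(
                nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
                for d in dilations
            )
            # tiny attention head: one weight per dilation rate, from pooled features
            self.attn = nn.Linear(channels, len(dilations))

        def forward(self, x):                        # x: (batch, channels, H, W)
            pooled = x.mean(dim=(2, 3))              # (batch, channels)
            weights = torch.softmax(self.attn(pooled), dim=1)       # (batch, paths)
            outs = torch.stack([p(x) for p in self.paths], dim=1)   # (batch, paths, C, H, W)
            return (weights[:, :, None, None, None] * outs).sum(dim=1)

    y = MultiReceptiveFieldBlock()(torch.randn(2, 32, 16, 16))  # -> (2, 32, 16, 16)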

  • Squeeze-and-Excitation Block Based Mask R-CNN for Object Instance Segmentation (peer-reviewed)

    Nagasawa K., Ishiyama S., Lu H., Kamiya T., Nakatoh Y., Serikawa S., Li Y.

    Communications in Computer and Information Science   1732 CCIS   56 - 64   January 2023

    Language: English   Publication type: Research paper (international conference proceedings)

    Deep learning-based methods such as AlexNet have taken center stage in image recognition. At present, image recognition based on deep learning is widely used in agriculture, factory automation, automated driving, the medical field, and so on. In the fields of automated driving and medical care, the accuracy of image recognition directly affects human lives. For these reasons, the importance of improving the accuracy of image recognition is clear. In this paper, we focus on instance segmentation tasks. The method used is Mask R-CNN, which is the basis of current state-of-the-art methods. The network structure is based on ResNet, and we try to improve accuracy by adding a Squeeze-and-Excitation block (SE Block). The experimental results show that this method has clear advantages for object instance segmentation.

    DOI: 10.1007/978-981-99-2789-0_5

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85163323022&origin=inward
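
    A minimal sketch of the Squeeze-and-Excitation block that the paper above inserts into the ResNet backbone of Mask R-CNN. The reduction ratio of 16 follows the common SE-Net setting and is an assumption here.

    # Illustrative sketch of a standard SE block; the reduction ratio is an assumption.
    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):                  # x: (batch, channels, H, W)
            squeeze = x.mean(dim=(2, 3))       # global average pooling ("squeeze")
            scale = self.fc(squeeze)           # per-channel gates ("excitation")
            return x * scale[:, :, None, None]

    out = SEBlock(channels=64)(torch.randn(1, 64, 32, 32))  # same shape, re-weighted channels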

  • Pose Estimation of Point Sets Using Residual MLP in Intelligent Transportation Infrastructure (peer-reviewed)

    Li Y., Yin Z., Zheng Y., Lu H., Kamiya T., Nakatoh Y., Serikawa S.

    IEEE Transactions on Intelligent Transportation Systems   January 2023

    Language: English   Publication type: Research paper (academic journal)

    6D pose estimation of arbitrary objects is a crucial topic for intelligent transportation infrastructure measurement. However, external environmental factors and the characteristics of the object itself affect the accuracy of the object's pose estimation in practical applications. In this paper, we propose a new multi-class dataset, ICD-4 (Industrial car Components Dataset), for 6D object pose estimation, which mainly includes four component categories, with 20,000 different scenarios per category. The ICD-4 dataset poses quite a few research challenges involving the range of object pose transformations and has significant research value for small-scale pose estimation tasks. We also propose an innovative method, PoseMLP, a pose estimation network that uses residual MLP (multilayer perceptron) modules to predict the 6D pose directly. The experimental results demonstrate the effectiveness and reliability of the proposed method.

    DOI: 10.1109/TITS.2023.3250604

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85153361050&origin=inward
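
    A minimal sketch of a residual MLP block of the kind a PoseMLP-style regressor could stack before a 6-D pose output (translation plus a rotation parameterization). The widths, depth, and the 1024-dimensional input feature are assumptions, not the paper's configuration.

    # Illustrative sketch only; layer widths and the 6-value output head are assumptions.
    import torch
    import torch.nn as nn

    class ResidualMLPBlock(nn.Module):
        def __init__(self, dim=256):
            super().__init__()
            self.body = nn.Sequential(
                nn.Linear(dim, dim), nn.ReLU(),
                nn.Linear(dim, dim),
            )

        def forward(self, x):
            return torch.relu(x + self.body(x))   # skip connection around the MLP

    # global point-set feature -> stacked residual blocks -> 6-D pose vector
    pose_head = nn.Sequential(
        nn.Linear(1024, 256),
        ResidualMLPBlock(), ResidualMLPBlock(),
        nn.Linear(256, 6),
    )
    pose = pose_head(torch.randn(8, 1024))  # -> (8, 6)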

  • PointNetX: Part Segmentation Based on PointNet Promotion (peer-reviewed)

    Zhao K., Lu H., Li Y.

    Communications in Computer and Information Science   1732 CCIS   65 - 76   January 2023

    Language: English   Publication type: Research paper (international conference proceedings)

    Recently, point cloud learning has become widely utilized in a variety of domains, including autonomous driving, robotics, and computer vision. PointNet, a pioneer in point cloud processing, uses max pooling to address the disorder of point clouds. However, PointNet's approach of mapping points to a high-dimensional space and then obtaining global features through max pooling still leads to a large loss of feature information. To this end, we propose a new PointNet-based segmentation and classification network called PointNetX. PointNetX expands the network's depth and the number of neurons. At the same time, we extract the features of different layers to compensate for the loss caused by pooling and use a better rotation matrix to adjust the rotation angle. We also add label smoothing to the loss function and use the Ranger optimizer during training to realize various point cloud tasks. Experiments show that PointNetX performs better in part segmentation as well as object classification than PointNet.

    DOI: 10.1007/978-981-99-2789-0_6

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85163355978&origin=inward
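
    A minimal sketch of the label-smoothing cross-entropy mentioned in the abstract above, written out by hand for clarity; recent PyTorch also exposes it directly as nn.CrossEntropyLoss(label_smoothing=...). The smoothing factor of 0.1 is an assumption.

    # Illustrative sketch only; the smoothing factor is a placeholder value.
    import torch
    import torch.nn.functional as F

    def smoothed_cross_entropy(logits, target, smoothing=0.1):
        num_classes = logits.size(1)
        log_probs = F.log_softmax(logits, dim=1)
        # mix the one-hot target with a uniform distribution over all other classes
        soft_target = torch.full_like(log_probs, smoothing / (num_classes - 1))
        soft_target.scatter_(1, target.unsqueeze(1), 1.0 - smoothing)
        return -(soft_target * log_probs).sum(dim=1).mean()

    loss = smoothed_cross_entropy(torch.randn(4, 10), torch.tensor([0, 3, 7, 9]))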

  • Underwater Image Restoration Based on Light Attenuation Prior and Scene Depth Fusion Model (peer-reviewed)

    Zhu X., Li Y., Lu H.

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   14406 LNCS   41 - 53   January 2023

    Language: English   Publication type: Research paper (international conference proceedings)

    Underwater images often suffer from blurry details, color distortion, and low contrast due to light absorption and scattering in water. Existing restoration technologies use a fixed attenuation coefficient value, which fails to account for the uncertainty of the water body and leads to suboptimal restoration results. To address these issues, we propose a scene depth fusion model that considers underwater light attenuation to obtain a more accurate attenuation coefficient for image restoration. Our method employs the quadtree decomposition method and a depth map to estimate the background light. We then fuse and refine the depth map, compute the attenuation coefficient of the water medium for a more precise transmission map, and apply a reversed underwater imaging model to restore the image. Experiments demonstrate that our method effectively enhances the details and colors of underwater images while improving the contrast. Moreover, our method outperforms several state-of-the-art methods in terms of both accuracy and quality, showing its superior performance.

    DOI: 10.1007/978-3-031-47634-1_4

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85177169936&origin=inward
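
    A minimal sketch of the underwater image-formation model that attenuation-prior restoration inverts: with background light B, scene depth d, and a per-channel attenuation coefficient beta, the transmission is t = exp(-beta * d) and the scene radiance is recovered as J = (I - B) / t + B. The constants below are placeholders; the paper estimates B, the depth map, and beta from the image itself (quadtree decomposition, depth fusion).

    # Illustrative sketch only; background light, depth and beta are placeholder values.
    import numpy as np

    def restore(image, depth, background, beta, t_min=0.1):
        """image: HxWx3 float in [0,1]; depth: HxW; background, beta: length-3 arrays."""
        t = np.exp(-beta[None, None, :] * depth[:, :, None])   # per-channel transmission
        t = np.clip(t, t_min, 1.0)                              # avoid division blow-up
        restored = (image - background) / t + background
        return np.clip(restored, 0.0, 1.0)

    img = np.random.rand(240, 320, 3)
    depth = np.random.rand(240, 320) * 5.0                      # metres, dummy values
    out = restore(img, depth, background=np.array([0.2, 0.5, 0.6]),
                  beta=np.array([0.40, 0.10, 0.08]))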

  • Remote Sensing Image Registration Based on Improved Geometric-Matching CNN (peer-reviewed)

    Morishima F., Lu H., Kamiya T.

    International Conference on Control, Automation and Systems   1745 - 1748   January 2023

    Language: English   Publication type: Research paper (international conference proceedings)

    Environmental change detection is one of the uses of satellite images. This process is performed by subtracting image pairs obtained by different time series or sensors. Therefore, image registration is an important pre-processing step in detection of environmental changes. Currently, image registration methods based on deep learning are gaining attention. In general, higher satellite image resolution results in more accurate registration. However, the increase in image size leads to higher computational costs during training and estimation of deep learning models. Then, we propose a method that reduces the number of parameters of the model to lower the computational cost while maintaining the accuracy. This method makes it easier to handle high-resolution satellite images. The proposed method modified the GMCNN (Geometric-matching CNN) architecture by adding CSA (Cosine Similarity Attention) and SE (Squeeze-and-Excitation) layers to enhance the feature map, and point-wise convolution to reduce the number of parameters. The improved GMCNN decreases the grid MSE by 0.0037 compared to the conventional GMCNN. It also reduces the number of parameters by 49.6%.

    DOI: 10.23919/ICCAS59377.2023.10316818

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85179178272&origin=inward

  • Underwater Visibility Enhancement IoT System in Extreme Environment (peer-reviewed)

    Li Y., Zhu X., Zheng Y., Lu H., Li J., Shen Z.

    IEEE Internet of Things Journal   January 2023

    Language: English   Publication type: Research paper (academic journal)

    Imagery captured in extreme underwater environments often presents unique challenges, including blurred details, color distortion, and reduced contrast. These discrepancies largely emanate from the intricate interplay of light absorption and scattering within the aquatic medium. Predominant restoration techniques, rather simplistically, apply a static attenuation coefficient, neglecting the dynamic nuances of underwater conditions, leading to an inconsistent restoration outcome. To counter these impediments, we introduce an avant-garde Underwater Internet of Things (Underwater IoT) system, underpinned by a scene-depth fusion paradigm. Our methodology astutely accounts for the spectral decay of light underwater to infer a more refined attenuation coefficient tailored to the specific scene. This system, employing a quadtree decomposition for precise localization coupled with depth mapping, facilitates an astute estimation of prevailing luminescence. This depth map, once synthesized and refined, aids in gauging the precise attenuation dynamics of the aqueous milieu, culminating in a more precise transmission map derivation. Segueing from this, we employ an inverse model to refurbish the original image. Experimental results highlight our system’s prowess in counteracting issues like muddied details and chromatic anomalies while concurrently amplifying contrast. In juxtaposition with a spectrum of existing methodologies, our innovation outshines in terms of finesse and accuracy, underscoring its unparalleled efficacy in the challenging underwater conditions.

    DOI: 10.1109/JIOT.2023.3320739

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85173065401&origin=inward

  • DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing (peer-reviewed)

    Jiang X., Xu X., Chen Z., Zhang J., Song J., Shen F., Lu H., Shen H.T.

    MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia   719 - 727   October 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    The Weakly-Supervised Audio-Visual Video Parsing (AVVP) task aims to parse a video into temporal segments and predict their event categories in terms of modalities, labeling them as either audible, visible, or both. Since the temporal boundaries and modality annotations are not provided and only video-level event labels are available, this task is more challenging than conventional video understanding tasks. Most previous works attempt to analyze videos by jointly modeling the audio and video data and then learning information from segment-level features with fixed lengths. However, such a design has two defects: 1) the varied semantic information hidden in different temporal lengths is neglected, which may lead the models to learn incorrect information; 2) due to the joint context modeling, the unique features of different modalities are not fully explored. In this paper, we propose a novel AVVP framework termed Dual Hierarchical Hybrid Network (DHHN) to tackle the above two problems. Our DHHN method consists of three components: 1) a hierarchical context modeling network for extracting different semantics in multiple temporal lengths; 2) a modality-wise guiding network for learning unique information from different modalities; 3) a dual-stream framework generating audio and visual predictions separately. It maintains the best adaptations to different modalities, further boosting the video parsing performance. Extensive quantitative and qualitative experiments demonstrate that our proposed method establishes the new state-of-the-art performance on the AVVP task.

    DOI: 10.1145/3503161.3548309

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85147911945&origin=inward

  • Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval (peer-reviewed)

    Wang K., Wang Y., Xu X., Liu X., Ou W., Lu H.

    MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia   601 - 609   October 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is an emerging research task that aims to retrieve data of new classes across sketches and images. It is challenging due to the heterogeneous distributions and the inconsistent semantics across seen and unseen classes of the cross-modal data of sketches and images. To realize knowledge transfer, the latest approaches introduce knowledge distillation, which optimizes the student network through the teacher signal distilled from the teacher network pre-trained on large-scale datasets. However, these methods often ignore the mispredictions of the teacher signal, which may make the model vulnerable when disturbed by the wrong output of the teacher network. To tackle the above issues, we propose a novel method termed Prototype-based Selective Knowledge Distillation (PSKD) for ZS-SBIR. Our PSKD method first learns a set of prototypes to represent categories and then utilizes an instance-level adaptive learning strategy to strengthen semantic relations between categories. Afterwards, a correlation matrix targeted for the downstream task is established through the prototypes. With the learned correlation matrix, the teacher signal given by transformers pre-trained on ImageNet and fine-tuned on the downstream dataset, can be reconstructed to weaken the impact of mispredictions and selectively distill knowledge on the student network. Extensive experiments conducted on three widely-used datasets demonstrate that the proposed PSKD method establishes the new state-of-the-art performance on all datasets for ZS-SBIR.

    DOI: 10.1145/3503161.3548382

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85150965204&origin=inward

  • Face Illumination Transfer and Swapping via Dense Landmark and Semantic Parsing (peer-reviewed)

    Jin X., Li Z., Ning N., Lu H., Li X., Zhang X., Zhu X., Fang X.

    IEEE Sensors Journal   22 ( 18 )   17391 - 17398   September 2022

    Language: English   Publication type: Research paper (academic journal)

    Image-based virtual illumination technology directly changes the illumination effect of objects in an image. It does not require complex light-propagation simulation; it is an image-based rendering technology. The image mainly relies on the imaging of a visual sensor, and virtual illumination technology is used to relight the image obtained by the visual sensor. This is a new cross-cutting direction in computer vision, virtual reality, and other fields. Face illumination swapping via dense landmarks and semantic parsing is a major branch. Keeping the geometric features of the target images and relighting the entire image instead of only the face area are problems to be solved in this research. This paper uses a three-dimensional model to analyze the illumination information of face images and re-render the illumination of the target face, finally achieving an illumination swap between two face images. We designed and implemented a 3DDFA-based face image illumination transfer method. First, 3DDFA is used to reconstruct the target face image and estimate the surface normal and albedo. Then the surface normal and face parsing results are aligned and filled to perform light rendering and illumination transfer of the face images. Finally, the 3DDFA-based illumination analysis and re-rendering of face images are extended to achieve the swap of illumination between face images. Experimental results show that this method can produce good face image illumination transfer and swapping while keeping the geometric features of the target images.

    DOI: 10.1109/JSEN.2020.3025918

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85139147050&origin=inward

  • Improving the Accuracy of Road Surface Distinction Based on Reflection Intensity Variations Using Ultrasonic Sensor (peer-reviewed)

    Nakashima S., Arimura H., Yamamoto M., Mu S., Lu H.

    IEEE Sensors Journal   22 ( 18 )   17399 - 17405   September 2022

    Language: English   Publication type: Research paper (academic journal)

    Falls in the daily lives of elderly people are increasing owing to the aging of society. Falls of elderly people usually correlate with poor physical condition and judgment and cause severe consequences. To prevent elderly people from falling, a movement support system was developed in our previous research. The system uses the reflection intensity value obtained by an ultrasonic sensor to identify the type of road surface and sounds an alert when the road surface is dangerous and may cause the user to slip or fall. However, the conventional method does not consider the variation in reflection intensity caused by changes in the shape of the road surface, so the detection rate on rough road surfaces remains low. In this paper, the variation in the reflection intensity is measured through investigations at multiple types of locations. The variation in the reflection intensity is introduced as a parameter to improve the road surface identification accuracy. The effectiveness of the proposed method is verified by experiments on four types of road surfaces. The necessary number of reflection intensity samples for road surface identification is also discussed.

    DOI: 10.1109/JSEN.2020.3033015

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85139164898&origin=inward
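
    A minimal sketch of the abstract's core idea, reduced to a nearest-centroid rule over the mean and standard deviation of the ultrasonic reflection intensity. The reference values and surface labels are invented for illustration, not measured data from the paper.

    # Illustrative sketch only; reference centroids and labels are made up for the example.
    import numpy as np

    REFERENCE = {                       # (mean intensity, std of intensity) per surface
        "smooth concrete": (0.80, 0.03),
        "asphalt":         (0.65, 0.06),
        "gravel":          (0.45, 0.15),
        "wet tile":        (0.90, 0.02),
    }

    def classify(samples):
        """samples: 1-D array of reflection intensities from one measurement window."""
        feature = np.array([np.mean(samples), np.std(samples)])
        def distance(ref):
            return np.linalg.norm(feature - np.array(ref))
        return min(REFERENCE, key=lambda name: distance(REFERENCE[name]))

    window = 0.45 + 0.15 * np.random.randn(50)   # simulated noisy readings
    print(classify(window))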

  • Query-based black-box attack against medical image segmentation model (peer-reviewed)

    Li S., Huang G., Xu X., Lu H.

    Future Generation Computer Systems   133   331 - 337   August 2022

    Language: English   Publication type: Research paper (academic journal)

    With the extensive deployment of deep learning, the research on adversarial example receives more concern than ever before. By modifying a small fraction of the original image, an adversary can lead a well-trained model to make a wrong prediction. However, existing works about adversarial attack and defense mainly focus on image classification but pay little attention to more practical tasks like segmentation. In this work, we propose a query-based black-box attack that could alter the classes of foreground pixels within a limited query budget. The proposed method improves the Adaptive Square Attack by employing a more accurate gradient estimation of loss and replacing the fixed variance of adaptive distribution with a learnable one. We also adopt a novel loss function proposed for attacking medical image segmentation models. Experiments on a widely-used dataset and well-known models demonstrate the effectiveness and efficiency of the proposed method in attacking medical image segmentation models. The implementation code and extensive analysis are available at https://github.com/Ikracs/medical_attack.

    DOI: 10.1016/j.future.2022.03.008

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85127684440&origin=inward

  • Feature Distillation Interaction Weighting Network for Lightweight Image Super-resolution (peer-reviewed)

    Gao G., Li W., Li J., Wu F., Lu H., Yu Y.

    Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022   36   661 - 669   June 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    Convolutional neural networks based single-image super-resolution (SISR) has made great progress in recent years. However, it is difficult to apply these methods to real-world scenarios due to the computational and memory cost. Meanwhile, how to take full advantage of the intermediate features under the constraints of limited parameters and calculations is also a huge challenge. To alleviate these issues, we propose a lightweight yet efficient Feature Distillation Interaction Weighted Network (FDIWN). Specifically, FDIWN utilizes a series of specially designed Feature Shuffle Weighted Groups (FSWG) as the backbone, and several novel mutual Wide-residual Distillation Interaction Blocks (WDIB) form an FSWG. In addition, Wide Identical Residual Weighting (WIRW) units and Wide Convolutional Residual Weighting (WCRW) units are introduced into WDIB for better feature distillation. Moreover, a Wide-Residual Distillation Connection (WRDC) framework and a Self-Calibration Fusion (SCF) unit are proposed to interact features with different scales more flexibly and efficiently. Extensive experiments show that our FDIWN is superior to other models to strike a good balance between model performance and efficiency. The code is available at https://github.com/IVIPLab/FDIWN.

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85129827653&origin=inward

  • Three-dimensional Object Detection Algorithm Based on Deep Neural Networks for Automatic Driving (peer-reviewed)

    Lu H., Yang S.

    Beijing Gongye Daxue Xuebao/Journal of Beijing University of Technology   48 ( 6 )   589 - 597   June 2022

    Language: Chinese   Publication type: Research paper (academic journal)

    In the automatic driving scene, a LiDAR sensor is usually used to obtain point cloud data with high accuracy and perceptible distance. Therefore, achieving object detection by effectively using point cloud data is a key technology for completing automatic driving tasks. Point clouds have the problems of sparsity, disorder, and large data volume, so traditional deep learning object detection methods have difficulty effectively extracting feature maps and meeting the accuracy requirements. This paper proposed a 3D object detection method based on the fusion of a voxel convolution network and a multi-layer perception model. The voxel convolution network was used to extract the global features of the point cloud, combined with the local features and distance relationships of the point cloud extracted by multi-layer perception. This improves the accuracy and speed of 3D object classification and position prediction. In this paper, the KITTI dataset was used to compare the proposed method with classical methods. According to the experimental results, the accuracy of the proposed method is significantly higher than that of previous methods.

    DOI: 10.11936/bjutxb2021100027

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85131694706&origin=inward

  • Learning Cross-Modal Common Representations by Private-Shared Subspaces Separation (peer-reviewed)

    Xu X., Lin K., Gao L., Lu H., Shen H.T., Li X.

    IEEE Transactions on Cybernetics   52 ( 5 )   3261 - 3275   May 2022

    Language: English   Publication type: Research paper (academic journal)

    Due to the inconsistent distributions and representations of different modalities (e.g., images and texts), it is very challenging to correlate such heterogeneous data. A standard solution is to construct one common subspace, where the common representations of different modalities are generated to bridge the heterogeneity gap. Existing methods based on common representation learning mostly adopt a less effective two-stage paradigm: first, generating separate representations for each modality by exploiting the modality-specific properties as the complementary information, and then capturing the cross-modal correlation in the separate representations for common representation learning. Moreover, these methods usually neglect that there may exist interference in the modality-specific properties, that is, the unrelated objects and background regions in images or the noisy words and incorrect sentences in the text. In this article, we hypothesize that explicitly modeling the interference within each modality can improve the quality of common representation learning. To this end, we propose a novel model private-shared subspaces separation (P3S) to explicitly learn different representations that are partitioned into two kinds of subspaces: 1) the common representations that capture the cross-modal correlation in a shared subspace and 2) the private representations that model the interference within each modality in two private subspaces. By employing the orthogonality constraints between the shared subspace and the private subspaces during the one-stage joint learning procedure, our model is able to learn more effective common representations for different modalities in the shared subspace by fully excluding the interference within each modality. Extensive experiments conducted on cross-modal retrieval verify the advantages of our P3S method compared with 15 state-of-the-art methods on four widely used cross-modal datasets.

    DOI: 10.1109/TCYB.2020.3009004

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85130766056&origin=inward
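
    A minimal sketch of an orthogonality penalty of the kind used to keep a shared subspace and a private subspace from encoding the same information: it penalizes the squared Frobenius norm of the cross-correlation between the two batch representations. The projection networks that produce the representations, and the centering step, are assumptions of this sketch rather than the paper's exact formulation.

    # Illustrative sketch only; the exact P3S objective may differ.
    import torch

    def orthogonality_loss(shared, private):
        """shared, private: (batch, dim) representations of the same samples."""
        shared = shared - shared.mean(dim=0)
        private = private - private.mean(dim=0)
        correlation = shared.t() @ private / shared.size(0)   # (dim, dim)
        return (correlation ** 2).sum()                        # squared Frobenius norm

    loss = orthogonality_loss(torch.randn(32, 128), torch.randn(32, 128))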

  • Cognitive Memory-Guided AutoEncoder for Effective Intrusion Detection in Internet of Things (peer-reviewed)

    Lu H., Wang T., Xu X., Wang T.

    IEEE Transactions on Industrial Informatics   18 ( 5 )   3358 - 3366   May 2022

    Language: English   Publication type: Research paper (academic journal)

    With the development of Internet of Things (IoT) technology, intrusion detection has become a key technology that provides solid protection for IoT devices against network intrusion. At present, artificial intelligence technologies are widely used for the intrusion detection task. However, unknown attacks may occur as networks develop, and attack samples are difficult to collect, resulting in unbalanced sample categories. In this case, previous intrusion detection methods suffer from high false positive rates and low detection accuracy, which restricts their application in real situations. In this article, we propose a novel method based on deep neural networks to tackle the intrusion detection task, termed Cognitive Memory-guided AutoEncoder (CMAE). The CMAE method leverages a memory module to enhance the ability to store normal feature patterns while inheriting the advantages of the autoencoder. Therefore, it is robust to imbalanced samples. Besides, using the reconstruction error as the evaluation criterion allows unknown attacks to be detected effectively. To obtain superior intrusion detection performance, we propose a feature reconstruction loss and a feature sparsity loss to constrain the proposed memory module, promoting the discriminativeness of memory items and the ability to represent normal data. Compared to previous state-of-the-art methods, sufficient experimental results reveal that the proposed CMAE method achieves excellent performance and effectiveness for intrusion detection.

    DOI: 10.1109/TII.2021.3102637

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85124613569&origin=inward
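
    A minimal sketch of the "reconstruction error as detection criterion" idea behind autoencoder-style intrusion detection, without the memory module described in the paper. Layer sizes, the 64-dimensional feature vector, and the threshold are placeholders.

    # Illustrative sketch only; architecture and threshold are placeholders, not CMAE itself.
    import torch
    import torch.nn as nn

    class TinyAutoEncoder(nn.Module):
        def __init__(self, dim=64, bottleneck=8):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, bottleneck))
            self.decoder = nn.Sequential(nn.Linear(bottleneck, 32), nn.ReLU(), nn.Linear(32, dim))

        def forward(self, x):
            return self.decoder(self.encoder(x))

    def anomaly_scores(model, x):
        # trained on normal traffic only, a large reconstruction error suggests an attack
        with torch.no_grad():
            return ((model(x) - x) ** 2).mean(dim=1)

    model = TinyAutoEncoder()
    scores = anomaly_scores(model, torch.randn(16, 64))
    flagged = scores > 0.5        # threshold would be tuned on validation data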

  • Answer Again: Improving VQA with Cascaded-Answering Model (peer-reviewed)

    Peng L., Yang Y., Zhang X., Ji Y., Lu H., Shen H.T.

    IEEE Transactions on Knowledge and Data Engineering   34 ( 4 )   1644 - 1655   April 2022

    Language: English   Publication type: Research paper (academic journal)

    Visual Question Answering (VQA) is a very challenging task, which requires understanding visual images and natural language questions simultaneously. In the open-ended VQA task, most previous solutions focus on understanding the question and image contents, as well as their correlations. However, they mostly reason about the answers in a one-stage way, so the semantics of the generated answers are largely ignored. In this paper, we propose a novel approach, termed Cascaded-Answering Model (CAM), which extends the conventional one-stage VQA model to a two-stage model. Hence, the proposed model can fully explore the semantics embedded in the predicted answers. Specifically, CAM is composed of two cascaded answering modules: a Candidate Answer Generation (CAG) module and a Final Answer Prediction (FAP) module. In the CAG module, we select multiple relevant candidates from the generated answers using a typical VQA approach with co-attention. In the FAP module, we integrate the information of the question and image, together with the semantics extracted from the selected candidate answers, to predict the final answer. Experimental results demonstrate that the proposed model produces high-quality candidate answers and achieves state-of-the-art performance on five large benchmark datasets: VQA-1.0, VQA-2.0, VQA-CP v2, TDIUC and COCO-QA.

    DOI: 10.1109/TKDE.2020.2998805

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85126530040&origin=inward

  • Multidimensional Deformable Object Manipulation Based on DN-Transporter Networks (peer-reviewed)

    Teng Y., Lu H., Li Y., Kamiya T., Nakatoh Y., Serikawa S., Gao P.

    IEEE Transactions on Intelligent Transportation Systems   January 2022

    Language: English   Publication type: Research paper (academic journal)

    In the process of transportation, the handling and loading methods for rigid objects are becoming more and more mature. However, whether in today's transportation systems or in daily life, for example when packing objects or sorting cables before transportation, the manipulation of deformable objects is unavoidable and has attracted more and more attention. Because of the very high degrees of freedom and the unpredictable physical state of deformable objects, it is difficult for robots to complete tasks in environments containing deformable objects. Therefore, we present a method based on imitation learning. From generated expert demonstrations, the agent learns the state sequence and then imitates the expert's trajectory sequence, which avoids the above-mentioned difficulties. In addition, compared with the baseline method, our proposed DN-Transporter Networks are more competitive in simulation environments involving cloth, ropes, or bags.

    DOI: 10.1109/TITS.2022.3168303

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85129430885&origin=inward

  • Learning Latent Dynamics for Autonomous Shape Control of Deformable Object (peer-reviewed)

    Lu H., Teng Y., Li Y.

    IEEE Transactions on Intelligent Transportation Systems   January 2022

    Language: English   Publication type: Research paper (academic journal)

    In recent years, the methods of loading and transporting rigid objects have become more and more perfect. However, in the process of transportation, the shape control of deformable objects has attracted extensive attention because deformable objects have been widely used in intelligent tasks such as packing and sorting cables before transportation. Restricted by the super-degrees of freedom and nonlinear dynamic models of deformable objects, planning the action trajectories to control the shape of deformable objects is a challenging task. In this work, we use contrastive learning to solve the shape control problem of deformable objects. The method jointly optimizes the visual representation model and dynamic model of deformable objects, maps the target nonlinear state to linear latent space which avoids model inference for deformable objects in infinite-dimensional configuration spaces. Furthermore, to extract effective information in the latent space, we construct an encoder with a multi-branch topology to improve the representation ability of the model. Experimentally, we collect dynamic trajectory data for random shape control task involving cloth or rope in a simulated environment. Then we apply it to train the proposed offline method to obtain latent dynamic models for shape control of deformable objects. In comparison with other baseline methods, our proposed method achieves substantial performance improvements.

    DOI: 10.1109/TITS.2022.3225322

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85144807815&origin=inward

  • 3D object detection using improved PointRCNN (peer-reviewed)

    Fukitani K., Shin I., Lu H., Yang S., Kamiya T., Nakatoh Y., Serikawa S.

    Cognitive Robotics   2   242 - 254   January 2022

    Language: English   Publication type: Research paper (academic journal)

    Recently, two-dimensional object detection (2D object detection) has been introduced in numerous applications such as building exterior diagnosis, crime prevention and surveillance, and medical fields. However, the distance (depth) information is not enough for indoor robot navigation, robot grasping, autonomous running, and so on, with conventional object detection. Therefore, in order to improve the accuracy of 3D object detection, this paper proposes an improvement of Point RCNN, which is a segmentation-based method using RPNs and has performed well in 3D detection benchmarks on the KITTI dataset commonly used in recognition tasks for automatic driving. The proposed improvement is to improve the network in the first stage of generating 3D box candidates in order to solve the problem of frequent false positives. Specifically, we added a Squeeze and Excitation (SE) Block to the network of pointnet++ that performs feature extraction in the first stage and changed the activation function from ReLU to Mish. Experiments were conducted on the KITTI dataset, which is commonly used in research aimed at automated driving, and an accurate comparison was conducted using AP. The proposed method outperforms the conventional method by several percent on all three difficulty levels.

    DOI: 10.1016/j.cogr.2022.12.001

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85144888388&origin=inward

  • 3D object recognition for coordination-less bin-picking automation (peer-reviewed)

    Ishiyama S., Lu H.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    At present, industrial robots are required to support high-mix, low-volume production, which calls for automation and flexibility in production lines. To realize these requirements, it is necessary to automate bin picking. In this work, we propose a three-dimensional object recognition method using deep learning to automate coordination-less bin picking. Deep learning-based methods require training data, but annotating the training data is costly. Therefore, we construct a coordination-less recognition model by using an automatic acquisition method for training data in a simulation environment. For the evaluation, we conducted experiments in simulation and real environments to verify the accuracy of the proposed method.

    DOI: 10.1117/12.2663323

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146686064&origin=inward

  • Automatic Classification of Respiratory Sound Considering Hierarchical Structure (peer-reviewed)

    Tabata M., Lu H., Kamiya T., Mabu S., Kido S.

    International Conference on Control, Automation and Systems   2022-November   537 - 541   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    Respiratory diseases are one of the leading causes of death worldwide. Approximately 8 million people die annually from respiratory diseases. Diagnosis is made primarily by auscultation using a stethoscope. The lack of quantitative criteria makes diagnosis difficult in areas where physicians are in short supply. To solve this problem, a computer aided diagnosis (CAD) system that quantitatively analyzes and classifies respiratory sounds and outputs them as a 'second opinion' is needed. In this paper, HPSS (Harmonic/Percussive Sound Separation) is used to separate abnormal respiratory sound features. Images are generated from the spectral envelopes obtained by linear prediction coefficients (LPC) for each of the three types of respiratory sound data before separation. A CNN (convolutional neural network) framework based on the hierarchical structure of the correct labels is introduced. The proposed method was applied to the dataset used in the International Conference on Biomedical and Health Informatics (ICBHI) 2017 Challenge. As a result, we obtained a sensitivity of 63.5%, specificity of 85.1%, average score of 74.3%, harmonic score of 72.7%, area under the curve of 87.8%, and false negative rate of 24.5%.

    DOI: 10.23919/ICCAS55662.2022.10003771

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146537757&origin=inward
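
    A minimal sketch of the HPSS preprocessing step named in the abstract, using librosa's harmonic/percussive separation on a recorded respiratory sound. The file name breath.wav is a placeholder, and the LPC-envelope imaging and hierarchical CNN from the paper are not reproduced here.

    # Illustrative sketch only; "breath.wav" is a placeholder input file.
    import librosa
    import numpy as np

    y, sr = librosa.load("breath.wav", sr=None)          # keep the native sampling rate
    harmonic, percussive = librosa.effects.hpss(y)        # waveform-domain HPSS

    # spectrogram magnitudes of the two components, e.g. as input images for a CNN
    H = np.abs(librosa.stft(harmonic))
    P = np.abs(librosa.stft(percussive))
    print(H.shape, P.shape)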

  • Environment Recognition from Spherical Camera Images Based on Multi-Attention DeepLab (peer-reviewed)

    Nishida Y., Guangxu L., Lu H., Kamiya T.

    International Conference on Control, Automation and Systems   2022-November   204 - 208   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    The electric wheelchair is an easy-to-operate means of transportation that does not require physical strength. With the number of electric wheelchair users increasing in recent years, the increase in traffic accidents has become a problem. By developing an autonomous electric wheelchair, it is expected that the risk of accidents will be reduced and the convenience of the electric wheelchair will be improved. Environment recognition is indispensable for the development of autonomous electric wheelchairs. We propose a semantic segmentation method for recognizing 16 objects in a traffic environment. This paper addresses problems raised in related research, such as the high price of autonomous electric wheelchairs due to the increase in the number of sensors used. We use panoramic images acquired by a spherical camera as input data and extend the Multi-Attention DeepLab algorithm to fit the recognition of distorted images. A new CNN model is constructed using DeepLab v3+, scSE Block, Pairwise Self-Attention, and Joint Pyramid Up-sampling. We conducted a recognition experiment using images taken on campus and verified the method's effectiveness (compared to DeepLab v3+, IoU and Dice showed accuracy improvements of 3.5% and 3.6%, respectively).

    DOI: 10.23919/ICCAS55662.2022.10003689

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146621100&origin=inward

  • Generation of Super-Resolution Images from Satellite Images Based on Improved RCAN (peer-reviewed)

    Morishima F., Lu H., Kamiya T.

    International Conference on Control, Automation and Systems   2022-November   213 - 216   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    Satellite images can be analyzed and used for a variety of purposes. In the future, satellite image analysis will become more important, since the number of satellite launches and the amount of satellite data increase every year. Under these circumstances, there are some problems to be solved. One is the existence of low-resolution satellite images. Analyzing low-resolution satellite images raises technical issues such as noise and the misclassification of objects, so high-resolution images are necessary. However, high-resolution satellite images are expensive and may not be available for past acquisitions. Super-resolution, a method introduced in image processing, can solve these problems. Convolutional neural network (CNN)-based methods are effective, and there is a need for models that can perform super-resolution with higher accuracy. In this paper, we propose a method for super-resolving satellite images based on an improved RCAN (residual channel attention network) model with SRM (style-based recalibration module). The proposed method improves the PSNR (peak signal to noise ratio) by 0.0181 dB compared to the conventional RCAN model.

    DOI: 10.23919/ICCAS55662.2022.10003856

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146557504&origin=inward
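
    A minimal sketch of the PSNR metric quoted above, computed directly from its definition for images scaled to [0, 1]; useful for checking a super-resolved output against its ground-truth reference.

    # Illustrative sketch; assumes float images in [0, 1] (peak = 1.0).
    import numpy as np

    def psnr(reference, estimate, peak=1.0):
        mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
        if mse == 0:
            return float("inf")
        return 10.0 * np.log10(peak ** 2 / mse)

    gt = np.random.rand(128, 128, 3)
    sr = np.clip(gt + 0.01 * np.random.randn(128, 128, 3), 0, 1)
    print(f"PSNR: {psnr(gt, sr):.2f} dB")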

  • Improved Point-Voxel Region Convolutional Neural Network for Small Object Detection (peer-reviewed)

    Xie Z., Tsuzaki M., Lu H., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    With the widespread use of LiDAR sensors, 3D object detection through 3D point cloud data processing has become a research target in robotics and autonomous driving. However, the disorder and sparsity of point cloud data are problems in traditional point cloud data processing, and it is challenging to detect objects using a large amount of point cloud data. Conventional 3D object detectors are mainly grid-based methods and point-based methods. PV-RCNN proposed a framework that combines voxel-based and point-based techniques, in which object features are extracted using 3D voxel CNNs. However, the resolution reduction caused by the CNN affects the localization of objects. This study aims to improve the detection accuracy of smaller objects by feeding not only a single output of the voxel CNN but also multiple outputs, including high-resolution outputs, to the RPN. We propose a new network that introduces a Multi-Scale Region Proposal Network to reduce the effect of resolution degradation. Our network has better recognition accuracy for small objects, such as bicycles, than the original PV-RCNN. In extensive experiments, we demonstrate that our model achieves a 5% improvement for small objects, such as cyclists, when trained on the KITTI dataset.

    DOI: 10.1117/12.2657106

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146716974&origin=inward

  • Recognition of Sidewalk Environment Based on WideSegPlus (peer-reviewed)

    Sakai Y., Lu H., Li Y., Kamiya T.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    In recent years, the elderly population in Japan has been increasing, and expectations for the utilization of welfare equipment are also increasing. Electric wheelchairs are one such type of equipment and are widely used as a convenient means of transportation. On the other hand, accidents have also occurred, and dangers have been pointed out when driving an electric wheelchair. We believe that the development of an autonomous mobile electric wheelchair can address the causes of such accidents, and it can be expected to reduce accidents and improve the convenience of electric wheelchairs. For the development of an autonomous electric wheelchair, environment recognition such as estimation of the current position, recognition of sidewalks and traffic lights, and prediction of the movement of objects is indispensable. To solve these problems, we develop an algorithm to recognize sidewalks, crosswalks, and traffic lights from video images. In recent years, deep learning has been widely applied in the field of image recognition. Therefore, we improve WideSeg, one of the semantic segmentation algorithms that apply CNNs (Convolutional Neural Networks), and develop an object recognition method using a new CNN model. In our approach, sidewalk correction and noise removal processing are added after performing semantic segmentation with the proposed model.

    DOI: 10.1117/12.2655680

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146673279&origin=inward

  • Underwater Image Super-Resolution Using Improved SRCNN (peer-reviewed)

    Horimachi R., Lu H., Zheng Y., Kamiya T., Nakatoh Y., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    In recent years, industrialization and economic development in countries around the world have led to an ever-increasing demand for energy. Renewable energies are attracting attention, but they still often use mineral resources such as coal, petroleum, and natural gas, and onshore resources are depleting day by day. These energy and metal resources, such as copper, support Japan's industries and affluent lifestyle, and if Japan continues to rely on imports for most of these resources, it will become difficult for Japan to secure a stable supply of these energies and resources. Therefore, mining of mineral resources on the seafloor is essential to solve these problems, and research on seafloor resource surveys and mining is underway. Because direct human exploration and mining of seafloor resources are naturally dangerous, underwater robots are used to explore and mine seafloor resources. However, due to light absorption and turbidity in water, the underwater image of an underwater robot is sometimes less visible, making exploration unsatisfactory. Therefore, there is a need for higher-resolution underwater images of underwater robots. In this study, we perform super-resolution of underwater images using an improved SRCNN to support research on underwater images of underwater robots. The conventional SRCNN method uses the ReLU function as the activation function, but the improved SRCNN uses the PReLU function and FReLU function, which are extended activation functions of the ReLU function, to improve accuracy.

    DOI: 10.1117/12.2655051

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146731084&origin=inward
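
    A minimal sketch of the classic three-layer SRCNN with its ReLU activations replaced by PReLU, the kind of change the abstract describes. Kernel sizes 9-1-5 and channel widths 64/32 follow the original SRCNN paper and are assumptions here; the FReLU variant (which needs an extra depthwise convolution) is omitted.

    # Illustrative sketch only; hyperparameters follow the original SRCNN, not necessarily this paper.
    import torch
    import torch.nn as nn

    class SRCNN_PReLU(nn.Module):
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, 64, kernel_size=9, padding=4), nn.PReLU(64),
                nn.Conv2d(64, 32, kernel_size=1),                  nn.PReLU(32),
                nn.Conv2d(32, channels, kernel_size=5, padding=2),
            )

        def forward(self, x):          # x: bicubically upscaled low-resolution image
            return self.net(x)

    out = SRCNN_PReLU()(torch.randn(1, 3, 64, 64))   # -> (1, 3, 64, 64)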

  • Underwater Video Networking and Target Tracking (peer-reviewed)

    Ota H., Lu H., Li J., Zheng Y., Kamiya T., Nakatoh Y., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    IoT technology has made remarkable progress in recent years, and the world is full of IoT devices that continue to evolve every day. From smartphones, personal computers, and smartwatches to home appliances such as refrigerators and washing machines, and even indoor lights and house keys, IoT devices have become an inseparable part of our lives. In addition to devices used by individuals, IoT technology supports our daily lives from both front and back sides, such as IoT-enabled industrial equipment and satellite positioning systems. Japan has been making a national push to shift to IoT in industries that reduce the burden on workers and have recently been promoting a plan called Smart Agriculture, Forestry, and Fisheries. Among these three types of industries, the agricultural sector is slightly ahead of the others, with the Smart Agriculture Demonstration Project starting in 2019, and 182 districts in Japan are implementing the project by FY2021. The forestry and fisheries industries are also developing daily to become next-generation industries based on the program established in December 2019, although they are behind agriculture. However, the examples mentioned so far are those that have been promoted with the help of companies and the national government, although everyone has benefited from them. In this paper, we propose an IoT device that can be used as an IoT buoy by using a microcomputer, Raspberry Pi, to create a camera device that can distribute underwater images in real-time and track its location.

    DOI: 10.1117/12.2655052

    Scopus

    Other link: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146731793&origin=inward

  • 3D Object Classification from Point Clouds (peer-reviewed)

    Li Y., Lu H.

    Proceedings of SPIE - The International Society for Optical Engineering   12508   January 2022

    Language: English   Publication type: Research paper (international conference proceedings)

    Artificial intelligence has achieved a breakthrough with the proposal and development of deep learning. Compared with traditional models, deep learning allows machines to extract features and train neural networks by learning weight parameters. Convolutional Neural Networks (CNNs), the centerpiece of deep learning, have achieved remarkable results in 2D image recognition, classification, and segmentation. The point cloud is a recent hot 3D data form in the field of deep learning. Point clouds retain spatial geometric information better than other forms of 3D data such as meshes and depth maps. Due to the disorder, rotation invariance, and uneven density distribution of 3D point clouds, high sensor noise, and complex scenes, deep learning on 3D point clouds is still in its initial stage, and there are significant challenges. The tasks of deep learning for point clouds are mainly classified into shape classification, instance segmentation, semantic segmentation, etc. This article specifically outlines the development of methods for shape classification tasks and the characteristics and differences of each method. In addition, a comparison of the training accuracy and efficiency of each method on the dataset is provided.

    DOI: 10.1117/12.2658785

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146672572&origin=inward

  • Joint Semantic-Instance Segmentation Method for Intelligent Transportation System 査読有り

    Li Y., Cai J., Zhou Q., Lu H.

    IEEE Transactions on Intelligent Transportation Systems   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Acquiring point cloud data from sensors and correctly understanding the scene is the core of an intelligent transportation system. Point cloud segmentation can help intelligent transportation systems distinguish different objects in the scene. Some methods process the point cloud through a feature extraction network and complete the segmentation task. However, these methods place high requirements on the feature extraction network, and the fineness of the features directly affects the final segmentation result. In this paper, we propose a new feature extraction network for segmentation by adding an encoder-decoder structure, which can extract multiscale local feature information from the feature map. In our view, the merged multiscale features yield a better feature matrix, which improves the performance of the segmentation tasks. We report results on the S3DIS dataset, where the new feature extraction network greatly improves both semantic segmentation and instance segmentation.

    DOI: 10.1109/TITS.2022.3190369

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85135241023&origin=inward

  • Context-Patch Representation Learning with Adaptive Neighbor Embedding for Robust Face Image Super-Resolution 査読有り

    Gao G., Yu Y., Lu H., Yang J., Yue D.

    IEEE Transactions on Multimedia   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Robust face image super-resolution (FSR) methods steered by representation learning have attracted extensive attention in the past few decades. Most previous methods were devoted to exploiting the local position patches in the training set for FSR. However, they usually overlooked the sufficient usage of the contextual information around the testing patches, which is useful for stable representation learning. In this paper, we attempt to utilize the context-patch around the testing patch and propose a method named context-patch representation learning with adaptive neighbor embedding (CRL-ANE) for FSR. On the one hand, we simultaneously use the testing position patch and its adjacent ones for stable representation weight learning. This contextual information can compensate for recovering missing details in the target patch. On the other hand, for each input patch set, due to its inherent facial structural properties, we design an adaptive neighbor embedding strategy to elaborately and adaptively choose primary candidates for more accurate reconstruction. These two improvements enable the proposed method to achieve better SR performance than other methods. Qualitative and quantitative experiments on several benchmarks have validated the superiority of the proposed method over some state-of-the-art methods. (A minimal neighbor-embedding sketch follows this entry.)

    DOI: 10.1109/TMM.2022.3192769

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85135204848&origin=inward
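
    The neighbor-embedding idea that CRL-ANE builds on can be illustrated with a minimal sketch. This is not the authors' code: it only shows the classic step of representing an LR test patch as a weighted combination of its nearest LR training patches and transferring those weights to the paired HR patches; the patch sizes, the neighbor count k, and the random data are placeholder assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        d_lr, d_hr, n_train, k = 25, 100, 500, 8      # hypothetical patch sizes and pool size

        lr_train = rng.normal(size=(n_train, d_lr))   # flattened LR training patches
        hr_train = rng.normal(size=(n_train, d_hr))   # paired HR training patches
        x = rng.normal(size=d_lr)                     # LR test patch

        # 1) k nearest LR neighbors of the test patch
        idx = np.argsort(np.linalg.norm(lr_train - x, axis=1))[:k]
        N = lr_train[idx]

        # 2) reconstruction weights w minimizing ||x - N^T w||^2 with sum(w) = 1
        G = (N - x) @ (N - x).T                       # local Gram matrix
        G += 1e-6 * np.trace(G) * np.eye(k)           # regularize for numerical stability
        w = np.linalg.solve(G, np.ones(k))
        w /= w.sum()

        # 3) transfer the same weights to the paired HR neighbors
        hr_patch = w @ hr_train[idx]
        print(hr_patch.shape)                         # (100,) -> e.g. a 10x10 HR patch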

  • Grasp Position Estimation from Depth Image Using Stacked Hourglass Network Structure 査読有り

    Hamamoto K., Lu H., Li Y., Kamiya T., Nakatoh Y.O., Serikawa S.

    Proceedings - 2022 IEEE 46th Annual Computers, Software, and Applications Conference, COMPSAC 2022   1188 - 1192   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, robots have come to be used not only in factories but also in many other settings. However, most robots currently in use can only perform actions they were programmed to perform in a predefined space. For robots to become widespread in the future, not only in factories and distribution warehouses but also in homes and other environments where robots receive complex commands and their surroundings are constantly changing, it is necessary to make robots intelligent. Therefore, this study proposes a deep learning grasp position estimation model using depth images to achieve intelligent pick-and-place. Only depth images were used as training data to build the deep learning model. Some previous studies have used both RGB images and depth images; in this study, we used only depth images because we expect the inference to be based on the object's shape, independent of its color information. By performing inference based on the target object's shape, the deep learning model is expected to minimize the need for re-training when the target object's packaging changes on the production line, since it does not depend on the RGB image. We propose a deep learning model that focuses on the stacked encoder-decoder structure of the Stacked Hourglass Network. We compared the proposed method with the baseline method using the same evaluation metrics and a real robot, and it shows higher accuracy than the methods in previous studies.

    DOI: 10.1109/COMPSAC54236.2022.00187

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85137004247&origin=inward

  • Robotic Grasp Detection for Parallel Grippers: A Review 査読有り

    Yin Z., Li Y., Cai J., Lu H.

    Proceedings - 2022 IEEE 46th Annual Computers, Software, and Applications Conference, COMPSAC 2022   1184 - 1187   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    With the continuous progress of robotic grasping technology, the application of robots in industry is being promoted. However, reliably grasping arbitrary objects is still a difficult problem for robotic grasping tasks. In this paper, the parallel gripper is studied as the end effector used in robotic grasp detection. Grasp detection includes two-dimensional planar grasping methods and six-degree-of-freedom grasping methods, where the former is constrained to grasping from one direction. This paper summarizes the development trends of the two families of methods and analyzes their advantages and disadvantages.

    DOI: 10.1109/COMPSAC54236.2022.00186

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85136942081&origin=inward

  • Edge Computing with Complementary Capsule Networks for Mental State Detection in Underground Mining Industry 査読有り

    Wang M., Wang J., Li Y., Lu H.

    IEEE Transactions on Industrial Informatics   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Most safety accidents in the underground resource mining industry are caused by human factors. This is because the non-uniformly lit, noisy, and dangerous environment easily evokes negative mental states and results in nonstandard production operations. This paper proposes an edge-computing mental-state detection framework for the Internet of Things in the underground mining industry. Moreover, a filtering algorithm using a defined threshold function is developed. Furthermore, a complemented capsule network model is constructed by using two residual modules. In addition, a two-stage mental-state fusion algorithm is proposed that combines EEG signals and facial expressions. Finally, the variation characteristics of mental state with illumination and coloring are explored. Experiments show that the mental state detection accuracy is increased by 2.6%, and that mental arousal is highest at illumination levels between 320 lx and 330 lx.

    DOI: 10.1109/TII.2022.3218839

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85141632945&origin=inward

  • Towards Secure and Privacy-Preserving Data Sharing for COVID-19 Medical Records: A Blockchain-Empowered Approach 査読有り

    Tan L., Yu K., Shi N., Yang C., Wei W., Lu H.

    IEEE Transactions on Network Science and Engineering   9 ( 1 )   271 - 281   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    COVID-19 is currently a major global public health challenge. In the battle against the outbreak of COVID-19, how to manage and share COVID-19 Electronic Medical Records (CEMRs) safely and effectively worldwide, prevent malicious users from tampering with CEMRs, and protect the privacy of patients are issues that deserve close attention. In particular, the semi-trusted medical cloud platform has become the primary means of hospital medical data management and information services, so security and privacy issues in the medical cloud platform are especially prominent and should be addressed with priority. To address these issues, on the basis of ciphertext-policy attribute-based encryption, we propose a blockchain-empowered security and privacy protection scheme with traceability and direct revocation for COVID-19 medical records. In this scheme, the blockchain is used for uniform identity authentication, and all public keys, revocation lists, etc., are stored on the blockchain. The system manager server is responsible for generating the system parameters and issues the private keys to the COVID-19 medical practitioners and users. The cloud service provider (CSP) stores the CEMRs and generates the intermediate decryption parameters using policy matching. A user can compute the decryption key if the user holds the private keys and the intermediate decryption parameters. Only when the user's attributes satisfy the access policy and the user's identity is not on the revocation list can the user obtain the intermediate parameters from the CSP. Malicious users can be traced via the tracking list and directly revoked. The security analysis demonstrates that the proposed scheme is secure under the Decision Bilinear Diffie-Hellman (DBDH) assumption and can resist many attacks. The simulation experiments demonstrate that the communication and storage overhead is lower than that of other schemes in the public-private key generation, CEMR encryption, and decryption stages. Besides, we verify that the proposed scheme works well on the blockchain in terms of both throughput and delay.

    DOI: 10.1109/TNSE.2021.3101842

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112670673&origin=inward

  • Generalizable Crowd Counting via Diverse Context Style Learning 査読有り

    Zhao W., Wang M., Liu Y., Lu H., Xu C., Yao L.

    IEEE Transactions on Circuits and Systems for Video Technology   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Existing crowd counting approaches predominantly perform well under the standard training-testing protocol. However, due to large style discrepancies not only among images but also within a single image, they suffer from obvious performance degradation when applied to unseen domains. In this paper, we aim to design a generalizable crowd counting framework which is trained on a source domain but can generalize well to other domains. To this end, we propose a gated ensemble learning framework. Specifically, we first propose a diverse fine-grained style attention model to help learn discriminative content feature representations, allowing diverse features to be exploited to improve generalization. We then introduce a channel-level binary gating ensemble model, in which a diverse feature prior, input-dependent guidance, and a density-grade classification constraint are implemented, to optimally select diverse content features to participate in the ensemble, taking advantage of their complementarity while avoiding redundancy. Extensive experiments show that our gating ensemble approach achieves superior generalization performance across four public datasets. Codes are publicly available at https://github.com/wdzhao123/DCSL.

    DOI: 10.1109/TCSVT.2022.3146459

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85123719773&origin=inward

  • Multifeature Fusion-Based Object Detection for Intelligent Transportation Systems 査読有り

    Yang S., Lu H., Li J.

    IEEE Transactions on Intelligent Transportation Systems   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    The detection of 3D objects with high precision from point cloud data has become a crucial research topic in intelligent transportation systems. A state-of-the-art 3D object detector can be obtained by effectively modeling global and local features. Nevertheless, in previous work on feature representation, volumetric generation or point-learning methods have difficulty building the relationship between local features and global features. Thus, we propose a multi-feature fusion network (MFFNet) to improve detection precision for 3D point cloud data by combining the global features from 3D voxel convolutions with the local features from a point-learning network. Our algorithm is an end-to-end detection framework that contains a voxel convolutional module, a local point feature module, and a detection head. Significantly, MFFNet constructs the local point feature set with point learning and sampling and builds the global feature map through 3D voxel convolution from raw point clouds. The detection head uses the obtained fusion features to predict the position and category of the examined 3D object, so the proposed method can obtain higher precision than existing approaches. Experimental evaluations obtain 97% mAP (mean average precision) on the KITTI 3D object detection dataset and 80% mAP on the Waymo Open dataset, which demonstrates the efficiency of the developed feature-fusion representation method for 3D objects and its satisfactory localization accuracy.

    DOI: 10.1109/TITS.2022.3155488

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85129435512&origin=inward

  • Cross-Modal Dynamic Networks for Video Moment Retrieval with Text Query 査読有り

    Wang G., Xu X., Shen F., Lu H., Ji Y., Shen H.T.

    IEEE Transactions on Multimedia   24   1221 - 1232   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Video moment retrieval with a text query aims to retrieve the most relevant segment from a whole video based on the given text query. It is a challenging cross-modal alignment task due to the huge gap between the visual and linguistic modalities and the noise generated by manual labeling of time segments. Most existing works use language information only in the cross-modal fusion stage, neglecting that language information also plays an important role in the retrieval stage. Besides, these works roughly compress the visual information in the video clips to reduce the computation cost, which loses subtle video information in long videos. In this paper, we propose a novel model termed Cross-modal Dynamic Networks (CDN), which dynamically generates convolution kernels from visual and language features. In the feature extraction stage, we also propose a frame selection module to capture the subtle video information in each video segment. With this approach, CDN can reduce the impact of visual noise without significantly increasing the computation cost, leading to better video moment retrieval results. Experiments on two challenging datasets, i.e., Charades-STA and TACoS, show that our proposed CDN method outperforms a range of state-of-the-art methods with more accurately retrieved moment video clips. The implementation code and detailed instructions for our proposed CDN method are provided at https://github.com/CFM-MSG/Code_CDN. (A minimal dynamic-convolution sketch follows this entry.)

    DOI: 10.1109/TMM.2022.3142420

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85123367454&origin=inward
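
    The query-conditioned convolution at the heart of the CDN abstract can be sketched as follows. This is an illustrative stand-in, not the released CDN code: a text-query feature generates a depthwise 1-D kernel that is slid over the clip features, so the filter is sample-dependent; all sizes and the single linear generator are assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class DynamicConv1d(nn.Module):
            """Generates a depthwise 1-D kernel from a text-query feature and applies it."""
            def __init__(self, dim=256, kernel_size=3):
                super().__init__()
                self.dim, self.k = dim, kernel_size
                self.kernel_gen = nn.Linear(dim, dim * kernel_size)

            def forward(self, video_feat, query_feat):
                # video_feat: (B, dim, T) clip features; query_feat: (B, dim) sentence feature
                b, d, t = video_feat.shape
                weight = self.kernel_gen(query_feat).view(b * d, 1, self.k)   # per-sample kernels
                x = video_feat.reshape(1, b * d, t)                           # fold batch into channels
                out = F.conv1d(x, weight, padding=self.k // 2, groups=b * d)  # per-sample depthwise conv
                return out.view(b, d, t)

        video = torch.randn(2, 256, 64)    # two videos, 64 clip features each
        query = torch.randn(2, 256)        # pooled text-query features
        print(DynamicConv1d()(video, query).shape)   # torch.Size([2, 256, 64])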

  • FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic Segmentation 査読有り

    Gao G., Xu G., Li J., Yu Y., Lu H., Yang J.

    IEEE Transactions on Multimedia   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Real-time semantic segmentation, which can be understood as pixel-level classification of the input image, currently has broad application prospects, especially in the fast-developing fields of autonomous driving and drone navigation. However, the huge computational burden together with redundant parameters are still obstacles to its technological development. In this paper, we propose a Fast Bilateral Symmetrical Network (FBSNet) to alleviate the above challenges. Specifically, FBSNet employs a symmetrical encoder-decoder structure with two branches, a semantic information branch and a spatial detail branch. The Semantic Information Branch (SIB) is the main branch with a semantic architecture that acquires the contextual information of the input image while obtaining a sufficient receptive field. The Spatial Detail Branch (SDB) is a shallow and simple network used to establish local dependencies of each pixel for preserving details, which is essential for restoring the original resolution during the decoding phase. Meanwhile, a Feature Aggregation Module (FAM) is designed to effectively combine the outputs of these two branches. Experimental results on Cityscapes and CamVid show that the proposed FBSNet strikes a good balance between accuracy and efficiency. Specifically, it obtains 70.9% and 68.9% mIoU with inference speeds of 90 fps and 120 fps on these two test datasets, respectively, with only 0.62 million parameters on a single RTX 2080Ti GPU.

    DOI: 10.1109/TMM.2022.3157995

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85126303855&origin=inward

  • Global-PBNet: A Novel Point Cloud Registration for Autonomous Driving 査読有り

    Zheng Y., Li Y., Yang S., Lu H.

    IEEE Transactions on Intelligent Transportation Systems   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Registration plays an individual and decisive role in many intelligent transportation systems. The advancement of deep-learning-based methods has enhanced the robustness and effectiveness of the preliminary registration stage, although such algorithms easily fall into local optima when refining the final accuracy. Conversely, traditional optimization-based methods have more reliable performance in terms of precision, but their performance still depends on the quality of the initialization. To solve the above problems, we propose Global-PBNet, which combines a point cloud network with a global optimization method. The framework uses the feature information of objects to perform high-precision coarse registration and then searches the entire 3D motion space using branch-and-bound and iterative closest point methods. The evaluation results show that the proposed network significantly reduces the influence of initial values on registration and has good robustness against noise and outliers. (A rigid-alignment sketch follows this entry.)

    DOI: 10.1109/TITS.2022.3153133

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85126284292&origin=inward
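
    The fine-registration stage that the abstract ends with (an ICP-style refinement) rests on a closed-form rigid alignment of corresponding points. The following sketch shows only that textbook SVD (Kabsch) step on synthetic correspondences; it is not Global-PBNet itself.

        import numpy as np

        def rigid_align(src, dst):
            """Closed-form R, t minimizing ||R @ src_i + t - dst_i||^2 over correspondences."""
            c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
            H = (src - c_src).T @ (dst - c_dst)                           # 3x3 cross-covariance
            U, _, Vt = np.linalg.svd(H)
            D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # guard against reflections
            R = Vt.T @ D @ U.T
            return R, c_dst - R @ c_src

        rng = np.random.default_rng(1)
        src = rng.normal(size=(100, 3))
        theta = np.deg2rad(30.0)
        R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                           [np.sin(theta),  np.cos(theta), 0.0],
                           [0.0, 0.0, 1.0]])
        dst = src @ R_true.T + np.array([0.5, -0.2, 1.0])
        R, t = rigid_align(src, dst)
        print(np.allclose(R, R_true), np.round(t, 3))                     # True [ 0.5 -0.2  1. ]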

  • Integrity assessment of corroded oil and gas pipelines using machine learning: A systematic review 査読有り

    Soomro A.A., Mokhtar A.A., Kurnia J.C., Lashari N., Lu H., Sambo C.

    Engineering Failure Analysis   131   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:記事・総説・解説・論説等(学術雑誌)

    Evaluating the integrity of oil and gas pipelines carrying hydrocarbon fluids is important for anticipating HSE measures. Corrosion is unavoidable, and ignoring it may have severe personal, economic, and environmental consequences. To anticipate the unexpected behavior of corrosion, most research relies on deterministic and probabilistic models. However, machine-learning-based approaches are better suited to the complex and extensive nature of degraded oil and gas pipelines. Moreover, using machine learning to assess integrity is a new field of study, so the literature lacks a comprehensive evaluation of current research issues. This study's goal is to evaluate the current state of machine learning (methods, variables, and datasets) and propose future directions for practitioners and academics. Currently, machine learning techniques are favored for predicting the integrity of damaged oil and gas pipelines. ANN, SVM, and hybrid models perform best, the hybrids owing to the combined strength of their constituent models; given these benefits, most researchers favor hybrid models over standalone models. We found that current research utilizes field data, simulation data, and experimental data, with field data being the most often used. Temperature, pH, pressure, and velocity are input characteristics included in most studies, demonstrating their importance in assessing the integrity of corroded oil and gas pipelines. This study also identified research gaps and shortcomings such as data availability, accuracy, and validation. Finally, some future suggestions and recommendations are proposed.

    DOI: 10.1016/j.engfailanal.2021.105810

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85117770376&origin=inward

  • Study on the Learning in Intelligent Control Using Neural Networks Based on Back-Propagation and Differential Evolution 査読有り

    Mu S., Shibata S., Lu H., Yamamoto T., Nakashima S., Tanaka K.

    EAI/Springer Innovations in Communication and Computing   17 - 29   2022年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    To obtain good control performance from ultrasonic motors in real applications, this chapter reports a study on learning in intelligent control using neural networks (NN) based on differential evolution (DE). To overcome the problems of characteristic variation and nonlinearity, an intelligent PID controller combined with a DE-type NN is studied. In the proposed method, an NN controller is designed to estimate the variation of the PID gains, adjusting the performance of the PID controller to minimize the error. The learning of the NN, i.e., the update of its weights, is implemented by DE. By employing the proposed method, the characteristic changes and nonlinearity of the USM can be compensated effectively. The effectiveness of the method is confirmed by experimental results. (A DE-based gain-tuning sketch follows this entry.)

    DOI: 10.1007/978-3-030-70451-3_2

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85113426057&origin=inward
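
    The chapter's idea of letting differential evolution tune controller parameters can be illustrated on a toy problem. The sketch below tunes plain PID gains for an assumed first-order plant with a standard DE/rand/1/bin loop; the plant, bounds, and DE settings are placeholder assumptions, not the chapter's ultrasonic-motor setup.

        import numpy as np

        def step_response_cost(gains, n_steps=200, dt=0.01):
            """Integral of absolute error of a PID loop around a toy plant y' = -y + 0.8u."""
            kp, ki, kd = gains
            y = integ = prev_e = cost = 0.0
            for _ in range(n_steps):
                e = 1.0 - y                                  # unit-step reference
                integ += e * dt
                u = kp * e + ki * integ + kd * (e - prev_e) / dt
                prev_e = e
                y += dt * (-y + 0.8 * u)
                cost += abs(e) * dt
            return cost

        rng = np.random.default_rng(0)
        bounds = np.array([[0, 20], [0, 10], [0, 1]], dtype=float)    # (kp, ki, kd) search ranges
        pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(30, 3))
        fit = np.array([step_response_cost(p) for p in pop])

        for _ in range(100):                                  # DE/rand/1/bin generations
            for i in range(len(pop)):
                a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
                mutant = np.clip(a + 0.6 * (b - c), bounds[:, 0], bounds[:, 1])
                trial = np.where(rng.random(3) < 0.9, mutant, pop[i])
                f = step_response_cost(trial)
                if f < fit[i]:                                # greedy selection
                    pop[i], fit[i] = trial, f

        print("best gains (kp, ki, kd):", np.round(pop[np.argmin(fit)], 3), "cost:", round(fit.min(), 4))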

  • A Two-Phase Learning-Based Swarm Optimizer for Large-Scale Optimization 査読有り

    Lan R., Zhu Y., Lu H., Liu Z., Luo X.

    IEEE Transactions on Cybernetics   51 ( 12 )   6284 - 6293   2021年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    In this article, a simple yet effective method, called a two-phase learning-based swarm optimizer (TPLSO), is proposed for large-scale optimization. Inspired by cooperative learning behavior in human society, TPLSO involves mass learning and elite learning. In the mass learning phase, TPLSO randomly selects three particles to form a study group and then adopts a competitive mechanism to update the members of the study group. All particles in the swarm are then sorted, and the elite particles with better fitness values are picked out. In the elite learning phase, the elite particles learn from each other to further search for more promising areas. A theoretical analysis of the exploration and exploitation abilities of TPLSO is performed and compared with several popular particle swarm optimizers. Comparative experiments on two widely used large-scale benchmark data sets demonstrate that the proposed TPLSO achieves better performance on diverse large-scale problems than several state-of-the-art algorithms. (A sketch of the mass-learning phase follows this entry.)

    DOI: 10.1109/TCYB.2020.2968400

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85122207208&origin=inward
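
    A loose sketch of the mass-learning phase described above: random triples of particles form study groups, and the worse members of each triple learn from the better ones. The learning rule and coefficients here are illustrative assumptions, not the exact TPLSO update.

        import numpy as np

        def sphere(x):                                        # toy objective to minimize
            return np.sum(x * x, axis=-1)

        rng = np.random.default_rng(0)
        n, dim = 90, 30
        pos = rng.uniform(-5, 5, size=(n, dim))
        vel = np.zeros((n, dim))

        for _ in range(300):
            for group in rng.permutation(n).reshape(-1, 3):   # random study groups of three
                g = group[np.argsort(sphere(pos[group]))]     # g[0] best, g[2] worst in the triple
                for loser, winners in ((g[1], g[:1]), (g[2], g[:2])):
                    r1, r2 = rng.random(dim), rng.random(dim)
                    target = pos[winners].mean(axis=0)        # learn from the better member(s)
                    vel[loser] = r1 * vel[loser] + r2 * (target - pos[loser])
                    pos[loser] += vel[loser]

        print("best objective value:", sphere(pos).min())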

  • DRRS-BC: Decentralized Routing Registration System Based on Blockchain 査読有り

    Lu H., Tang Y., Sun Y.

    IEEE/CAA Journal of Automatica Sinica   8 ( 12 )   1868 - 1876   2021年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    The Border Gateway Protocol (BGP), a typical inter-domain routing protocol, has become indispensable infrastructure of the Internet. However, it is vulnerable to misconfigurations and malicious attacks, since BGP does not provide a sufficient authentication mechanism for route advertisements. As a result, it has been involved in many security incidents with huge economic losses. Existing solutions to the routing security problem, such as S-BGP, So-BGP, Ps-BGP, and RPKI, are based on a Public Key Infrastructure and face a high security risk from their centralized structure. In this paper, we propose a decentralized blockchain-based route registration framework, the decentralized route registration system based on blockchain (DRRS-BC). In DRRS-BC, we produce a global transaction ledger from the address prefixes and autonomous system numbers of multiple organizations and ASs, which is maintained by all blockchain nodes and further used for authentication. By applying blockchain, DRRS-BC solves the problems of identity authentication and behavior authentication, as well as the promotion and deployment problem, without depending on an authentication center. Moreover, it resists prefix and sub-prefix hijacking attacks and meets the performance and security requirements of route registration.

    DOI: 10.1109/JAS.2021.1004204

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112660306&origin=inward

  • CAA: Candidate-Aware Aggregation for Temporal Action Detection 査読有り

    Ren Y., Xu X., Shen F., Yao Y., Lu H.

    MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia   4930 - 4938   2021年10月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Temporal action detection aims to locate specific segments of action instances in an untrimmed video. Most existing approaches extract the features of all candidate video segments and then classify them separately; however, they may neglect the underlying relationship among candidates. In this paper, we propose a novel model termed Candidate-Aware Aggregation (CAA) to tackle this problem. In CAA, we design the Global Awareness (GA) module to exploit long-range relations among all candidates from a global perspective, which enhances the features of action instances. The GA module is then embedded into a multi-level hierarchical network named FENet to aggregate local features in adjacent candidates and suppress background noise. As a result, the relationship among candidates is explicitly captured from both local and global perspectives, which ensures more accurate predictions for the candidates. Extensive experiments conducted on two popular benchmarks, ActivityNet-1.3 and THUMOS-14, demonstrate the superiority of CAA compared to state-of-the-art methods.

    DOI: 10.1145/3474085.3475616

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85119326285&origin=inward

  • Robust Facial Image Super-Resolution by Kernel Locality-Constrained Coupled-Layer Regression 査読有り

    Gao G., Zhu D., Lu H., Yu Y., Chang H., Yue D.

    ACM Transactions on Internet Technology   21 ( 3 )   2021年08月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Super-resolution methods for facial images via representation learning have become very effective due to their efficiency. The key problem in facial image super-resolution is to reveal the latent relationship between the low-resolution (LR) and corresponding high-resolution (HR) training patch pairs. To simultaneously utilize the contextual information of the target position and the manifold structure of the original HR space, in this work we design a robust context-patch facial image super-resolution scheme via kernel locality-constrained coupled-layer regression (KLC2LR) to obtain the desired HR version from the acquired LR image. KLC2LR acquires contextual surrounding patches to represent the target patch and adds an HR-layer constraint to compensate for detail information. Additionally, KLC2LR acquires more high-frequency information by searching for nearest neighbors in the HR sample space. We also utilize a kernel function to map features from the original low-dimensional space into a high-dimensional one to capture potential nonlinear characteristics. Comparative experiments in both noisy and noiseless cases verify that the proposed methodology outperforms many existing predominant facial image super-resolution methods.

    DOI: 10.1145/3418462

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85114284125&origin=inward

  • A Hybrid Feature Selection Algorithm Based on a Discrete Artificial Bee Colony for Parkinson's Diagnosis 査読有り

    Li H., Pun C.M., Xu F., Pan L., Zong R., Gao H., Lu H.

    ACM Transactions on Internet Technology   21 ( 3 )   2021年08月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Parkinson's disease is a neurodegenerative disease that affects millions of people around the world and cannot be fundamentally cured. Automatic identification of early Parkinson's disease from feature datasets is one of the most challenging medical tasks today. Many features in these datasets are useless or suffer from problems such as noise, which affect the learning process and increase the computational burden. To ensure optimal classification performance, this article proposes a hybrid feature selection algorithm based on an improved discrete artificial bee colony algorithm to improve the efficiency of feature selection. The algorithm combines the advantages of filters and wrappers to eliminate most of the uncorrelated or noisy features and determine the optimal feature subset. In the filter part, three different variable-ranking methods are employed to pre-rank the candidate features, and the artificial bee colony population is then initialized based on the significance of the re-ranked features. In the wrapper part, the artificial bee colony algorithm evaluates individuals (feature subsets) based on the classification accuracy of the classifier to reach the optimal feature subset. In addition, for the first time, we introduce a strategy that can automatically select the best classifier within the search framework more quickly. In comparisons on several publicly available datasets, the proposed method achieves better performance than other state-of-the-art algorithms and extracts fewer, more effective features. (A filter-plus-wrapper sketch follows this entry.)

    DOI: 10.1145/3397161

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85099891928&origin=inward
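
    The filter-plus-wrapper structure described above can be illustrated with common scikit-learn pieces. Mutual information stands in for the filter pre-ranking, and a simple forward search stands in for the discrete artificial bee colony that explores subsets in the paper; the synthetic dataset and the classifier are assumptions.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import mutual_info_classif
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        X, y = make_classification(n_samples=300, n_features=40, n_informative=6,
                                   n_redundant=4, random_state=0)

        # Filter stage: pre-rank candidate features by mutual information with the label.
        ranking = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]

        # Wrapper stage: grow a subset while cross-validated accuracy improves
        # (a greedy stand-in for the bee-colony search over subsets).
        clf = KNeighborsClassifier(n_neighbors=5)
        selected, best_acc = [], 0.0
        for f in ranking[:15]:                                # only top-ranked candidates enter the wrapper
            trial = selected + [int(f)]
            acc = cross_val_score(clf, X[:, trial], y, cv=5).mean()
            if acc > best_acc:
                selected, best_acc = trial, acc

        print("selected features:", selected, "cv accuracy: %.3f" % best_acc)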

  • Multi-Aspect Aware Session-Based Recommendation for Intelligent Transportation Services 査読有り

    Zhang Y., Li Y., Wang R., Hossain M.S., Lu H.

    IEEE Transactions on Intelligent Transportation Systems   22 ( 7 )   4696 - 4705   2021年07月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    In an intelligent transportation system, session data usually represents the users' demand. However, traditional approaches focus only on the sequence information or the last item clicked by the user, which cannot fully represent user preferences. To address this issue, this paper proposes a Multi-aspect Aware Session-based Recommendation (MASR) model for intelligent transportation services, which comprehensively considers the user's personalized behavior from multiple aspects. In addition, we develop a concise and efficient transformer-style self-attention module to analyze the sequence information of the current session and accurately grasp the user's intention. Finally, the experimental results show that MASR can improve user satisfaction with more accurate and rapid recommendations and reduce the number of user operations, thereby decreasing the safety risk during the transportation service. (A self-attention sketch follows this entry.)

    DOI: 10.1109/TITS.2020.2990214

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85110861948&origin=inward
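
    A tiny sketch of transformer-style self-attention over a click session, in the spirit of the MASR abstract. Item-ID embeddings pass through one self-attention layer, and the pooled session representation scores the whole catalogue; the sizes, the single layer, and the scoring rule are assumptions made for brevity, not the paper's architecture.

        import torch
        import torch.nn as nn

        class SessionAttentionRecommender(nn.Module):
            def __init__(self, n_items=1000, dim=64, n_heads=4):
                super().__init__()
                self.item_emb = nn.Embedding(n_items, dim)
                self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
                self.norm = nn.LayerNorm(dim)

            def forward(self, session):                       # session: (B, L) clicked item ids
                h = self.item_emb(session)                    # (B, L, dim)
                attended, _ = self.attn(h, h, h)              # self-attention over the session
                rep = self.norm(attended.mean(dim=1))         # pooled session representation
                return rep @ self.item_emb.weight.T           # scores over the whole catalogue

        model = SessionAttentionRecommender()
        scores = model(torch.randint(0, 1000, (2, 10)))       # two sessions of ten clicks
        print(scores.shape)                                   # torch.Size([2, 1000])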

  • Aspect-Based Sentiment Analysis of User Reviews in 5G Networks 査読有り

    Zhang Y., Lu H., Jiang C., Li X., Tian X.

    IEEE Network   35 ( 4 )   228 - 233   2021年07月

     詳細を見る

    記述言語:英語   掲載種別:記事・総説・解説・論説等(学術雑誌)

    Aspect-based sentiment analysis can provide consumers with clear and objective sentiment recommendations from massive amounts of data and helps overcome the ambiguity of subjective human judgments. However, the robustness and accuracy of existing sentiment analysis methods still need to be improved. In this article, deep learning and machine learning techniques are combined to construct a sentiment analysis model based on ensemble learning ideas. Furthermore, the proposed model is applied to sentiment classification of user reviews about restaurants, which are representative location-based and user-oriented applications in 5G networks. Specifically, a multi-aspect labeling model is established, and an ensemble aspect-based model is proposed based on the concept of ensemble learning to predict the consumer's true consumption feelings and willingness to consume again, and to improve the predictive performance of machine learning algorithms within a single domain.

    DOI: 10.1109/MNET.011.2000400

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112600992&origin=inward

  • A Model for Joint Planning of Production and Distribution of Fresh Produce in Agricultural Internet of Things 査読有り

    Han J., Lin N., Ruan J., Wang X., Wei W., Lu H.

    IEEE Internet of Things Journal   8 ( 12 )   9683 - 9696   2021年06月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    The production and distribution planning of fresh produce is a complex optimization problem that is affected by many factors, including the perishable nature of the produce. Farmers cannot guarantee the efficiency and accuracy of production and distribution decisions on their own. Given the close relationship between the production and distribution of annual fresh produce, the aim of our research is to solve the two-stage joint planning problem and ultimately maximize the revenue of farmers. An internal relationship matrix between the production and distribution stages is established. On this basis, we propose a mixed-integer programming (MIP) model that covers labor and capital constraints. The resulting decisions are based not only on price estimation and resource availability but also on the impact of agricultural Internet-of-Things technology and the special requirements of each distribution channel. Numerical experiments demonstrate that when the planting area is 1, 4, and 6 ha, the proposed joint planning model can improve the distribution revenue of farmers by 7.92%, 4.15%, and 4.94%, respectively, compared with the traditional approach of making distribution decisions separately. Management insights are obtained for different decision scenarios; for example, farmers should carefully sort and package products and choose a timely and reliable third-party express delivery company. Additionally, the proposed strategy can evaluate the impact of distribution channels on farmers' revenue.

    DOI: 10.1109/JIOT.2020.3037729

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85098776034&origin=inward

  • CT temporal subtraction: Techniques and clinical applications 査読有り

    Aoki T., Kamiya T., Lu H., Terasawa T., Ueno M., Hayashida Y., Murakami S., Korogi Y.

    Quantitative Imaging in Medicine and Surgery   11 ( 6 )   2021年06月

     詳細を見る

    記述言語:英語   掲載種別:記事・総説・解説・論説等(学術雑誌)

    DOI: 10.21037/qims-20-1367

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85104634131&origin=inward

  • Output-Bounded and RBFNN-Based Position Tracking and Adaptive Force Control for Security Tele-Surgery 査読有り

    Wang T., Ji X., Song A., Madani K., Chohra A., Lu H., Monero R.

    ACM Transactions on Multimedia Computing, Communications and Applications   17 ( 2s )   2021年06月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    In secure e-health brain neurosurgery, one of the important processes is to move the electrocoagulation tool to the appropriate position in order to excavate the diseased tissue. However, it is difficult for surgeons to operate the electrocoagulation tool freely, as the workspace in the brain is very narrow. Due to the precision, vulnerability, and important functions of brain tissue, it is essential to ensure the precision and safety of the operation for the brain tissue surrounding the diseased part. The present study proposes a robot-assisted tele-surgery system to accomplish this process. To achieve accuracy, an output-bounded and RBF-neural-network-based bilateral position control method was designed to guarantee the stability and accuracy of the operation process. To keep bleeding and damage minimal, an adaptive force control of the slave manipulator was proposed, making it suitable for contact with the susceptible vessels, nerves, and brain tissues. The stability was analyzed, and the numerical simulation results revealed the high performance of the proposed controls.

    DOI: 10.1145/3394920

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85108542424&origin=inward

  • User-Oriented Virtual Mobile Network Resource Management for Vehicle Communications 査読有り

    Lu H., Zhang Y., Li Y., Jiang C., Abbas H.

    IEEE Transactions on Intelligent Transportation Systems   22 ( 6 )   3521 - 3532   2021年06月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Currently, advanced communications and networks greatly enhance user experience and have a major impact on all aspects of people's lifestyles in terms of work, society, and the economy. However, improving the competitiveness and sustainability of vehicle network services, in terms of better user experience, higher resource utilization, and effective personalized services, is a great challenge. To address these issues, this paper proposes a virtual network resource management scheme based on user behavior to further optimize existing vehicle communications. In particular, ensemble learning is implemented in the proposed scheme to predict the user's voice call duration and traffic usage to support user-centric mobile service optimization. Extensive experiments show that the proposed scheme can significantly improve the quality of services and experiences and that it provides a novel idea for optimizing vehicle networks.

    DOI: 10.1109/TITS.2020.2991766

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85104993954&origin=inward

  • ATTDC: An Active and Traceable Trust Data Collection Scheme for Industrial Security in Smart Cities 査読有り

    Shen M., Liu A., Huang G., Xiong N.N., Lu H.

    IEEE Internet of Things Journal   8 ( 8 )   6437 - 6453   2021年04月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    With billions of sensing devices deployed in smart cities to monitor regions of interest and collect large amounts of sensing data, Internet-of-Things (IoT) applications are widely used in various fields and empower intelligent smart cities. Because the smart decisions made by IoT applications depend on the reliability of data collection, it is pivotal to collect data from trusted sensing devices. However, identifying the credibility of sensor nodes to ensure the credibility of data collection is a challenging issue. In this article, an active and traceable trust-based data collection (ATTDC) scheme is proposed to collect trustworthy data in the Internet of Things. The main contributions of this article are as follows: 1) an active trust framework is proposed to quickly obtain the trustworthiness of sensor nodes by using unmanned aerial vehicles (UAVs) with a piggybacking method; 2) to accurately obtain the trust degree of the sensor nodes, a traceable trust acquisition method is proposed in which nodes in the network send data packets with digital signatures and suspicious nodes are traced along the data routing paths to obtain active trust, so that the acquisition cost of the network can be effectively reduced; and 3) to reduce the acquisition cost of the UAV, an ant-colony-based flight path algorithm is designed to shorten the flight path of the UAV while obtaining the credibility evaluation of as many nodes as possible. The experimental results show that the proposed ATTDC scheme can identify the trust of the sensing nodes faster and more accurately, ensuring the credibility of data collection.

    DOI: 10.1109/JIOT.2021.3049173

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85099410892&origin=inward

  • A supervoxel classification based method for multi-organ segmentation from abdominal ct images 査読有り

    Wu J., Li G., Lu H., Kamiya T.

    Journal of Image and Graphics(United Kingdom)   9 ( 1 )   9 - 14   2021年03月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Multi-organ segmentation is a critical step in a Computer-Aided Diagnosis (CAD) system. We propose a novel method for automatic abdominal multi-organ segmentation that introduces spatial information into the supervoxel classification process. Supervoxels whose boundaries adhere to anatomical edges are extracted from the images using Simple Linear Iterative Clustering (SLIC). A random forest classifier is then built to predict the labels of the supervoxels according to their spatial and intensity features. Thirty abdominal CT images are used in the segmentation experiments for the spleen, right kidney, left kidney, and liver regions. The experimental results show that the proposed method achieves higher segmentation accuracy than our previous model-based method. (A two-step sketch follows this entry.)

    DOI: 10.18178/joig.9.1.9-14

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85109040224&origin=inward
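
    The two steps named in the abstract, SLIC supervoxels followed by random-forest classification, can be sketched on a small synthetic volume (assuming scikit-image >= 0.19 for the channel_axis argument). The feature set, all parameters, and the train-on-the-same-volume shortcut are illustrative only, not the paper's pipeline.

        import numpy as np
        from skimage.segmentation import slic
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)
        volume = rng.normal(size=(40, 64, 64))                # stand-in for a CT volume
        volume[10:30, 20:44, 20:44] += 3.0                    # one bright "organ" region
        labels_gt = np.zeros(volume.shape, dtype=int)
        labels_gt[10:30, 20:44, 20:44] = 1
        volume = (volume - volume.min()) / (volume.max() - volume.min())   # normalize to [0, 1]

        # 1) over-segment the volume into supervoxels
        sv = slic(volume, n_segments=400, compactness=0.1, channel_axis=None)

        # 2) per-supervoxel features: mean intensity plus centroid coordinates
        zz, yy, xx = np.indices(volume.shape)
        feats, targets = [], []
        for sid in np.unique(sv):
            m = sv == sid
            feats.append([volume[m].mean(), zz[m].mean(), yy[m].mean(), xx[m].mean()])
            targets.append(int(labels_gt[m].mean() > 0.5))    # majority label of the supervoxel

        # 3) classify supervoxels with a random forest (trained on the same volume here
        #    only to show the mechanics; real use trains on separately annotated cases)
        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(feats, targets)
        print("training accuracy:", clf.score(feats, targets))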

  • 5G-Network-Enabled Smart Ambulance: Architecture, Application, and Evaluation 査読有り

    Zhai Y., Xu X., Chen B., Lu H., Wang Y., Li S., Shi X., Wang W., Shang L., Zhao J.

    IEEE Network   35 ( 1 )   190 - 196   2021年03月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    As the fifth generation (5G) network comes to the fore, the realization of 5G-enabled services has attracted much attention from both healthcare academics and practitioners. In particular, a 5G-enabled emergency ambulance service makes it possible to seamlessly connect a patient and an ambulance crew at an accident scene or in transit with the awaiting emergency department team at the destination hospital, so as to improve the rescue rate of patients. However, the application of the 5G network to ambulance services currently lacks a reliable solution and simulation testing of its performance in the existing literature. To this end, the primary aim of this study is to propose a 5G-enabled smart ambulance service and then test the quality of service of the proposed solution in experimental settings. We also consider emergency scenarios to investigate the task completion and accuracy of the 5G-enabled smart ambulance and to verify the superiority of our proposed solution. Our study explores the value of a 5G-enabled smart ambulance and provides practical insights for 5G network construction, business development, and network optimization of smart ambulance services.

    DOI: 10.1109/MNET.011.2000014

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85101132108&origin=inward

  • Depth-Distilled Multi-focus Image Fusion 査読有り

    Zhao F., Zhao W., Lu H., Liu Y., Yao L., Liu Y.

    IEEE Transactions on Multimedia   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Homogeneous regions are smooth areas that lack blur cues for discriminating whether they are focused or non-focused; they therefore pose a great challenge to achieving highly accurate multi-focus image fusion (MFIF). Fortunately, we observe that depth maps are highly related to focus and defocus and thereby contain considerable discriminative power for locating homogeneous regions. This offers the potential to provide additional depth cues to assist the MFIF task. Taking depth cues into consideration, in this paper we propose a new depth-distilled multi-focus image fusion framework, namely D2MFIF. In D2MFIF, a depth-distilled model (DDM) is designed to adaptively transfer depth knowledge into the MFIF task, gradually improving MFIF performance. Moreover, a multi-level decision map fusion mechanism is designed to integrate multi-level decision maps from intermediate outputs to improve the final prediction. Visual and quantitative experimental results demonstrate the superiority of our method over several state-of-the-art methods. Codes and models will be released.

    DOI: 10.1109/TMM.2021.3134565

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85121772794&origin=inward

  • Characteristics based visual servo for 6DOF robot arm control 査読有り

    Tsuchida S., Lu H., Kamiya T., Serikawa S.

    Cognitive Robotics   1   76 - 82   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Visual servoing is a method for robot arm motion control. The arm is controlled through the end-effector velocity, which is computed from the image Jacobian (interaction) matrix and the feature error vector. In general, automatic robotic tasks require a high-quality sensor that can measure 3-dimensional distance, and calibration is needed to align the sensor frame with the robot frame in Euclidean space. In this paper, we use only an RGB camera for data collection, which does not require calibration of the sensor frame; thus, our method is simpler than other automatic motion methods. Meanwhile, the proposed characteristics-based visual servo method varies the hyperparameter and demonstrates its effectiveness in terms of pose-error precision in both simulation and real environments. (A sketch of the basic control law follows this entry.)

    DOI: 10.1016/j.cogr.2021.06.002

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85125125810&origin=inward
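
    The control law behind the abstract, an end-effector velocity computed from an image Jacobian and a feature-error vector, is the standard IBVS rule v = -λ L⁺ e. The sketch below applies it to four point features with an assumed constant depth and gain; it is not the paper's characteristics-based variant.

        import numpy as np

        def interaction_matrix(points, Z):
            """Stack the 2x6 interaction matrices of point features (normalized image coords)."""
            rows = []
            for x, y in points:
                rows.append([-1 / Z, 0.0, x / Z, x * y, -(1 + x * x), y])
                rows.append([0.0, -1 / Z, y / Z, 1 + y * y, -x * y, -x])
            return np.array(rows)

        lam, Z = 0.5, 1.0                                     # assumed gain and feature depth
        current = np.array([[0.10, 0.12], [-0.15, 0.08], [0.05, -0.20], [-0.12, -0.10]])
        desired = np.array([[0.20, 0.20], [-0.20, 0.20], [0.20, -0.20], [-0.20, -0.20]])

        e = (current - desired).ravel()                       # feature-error vector, shape (8,)
        L = interaction_matrix(current, Z)                    # image Jacobian, shape (8, 6)
        v = -lam * np.linalg.pinv(L) @ e                      # commanded camera/end-effector twist
        print("twist [vx vy vz wx wy wz]:", np.round(v, 4))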

  • Robust Motion Averaging under Maximum Correntropy Criterion 査読有り

    Zhu J., Hu J., Lu H., Chen B., Li Z., Li Y.

    Proceedings - IEEE International Conference on Robotics and Automation   2021-May   5283 - 5288   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, motion averaging has been introduced as an effective means to solve the multi-view registration problem. This method aims to recover global motions from a set of relative motions, but the original method is sensitive to outliers because it optimizes a Frobenius-norm error. Accordingly, this paper proposes a novel robust motion averaging method based on the maximum correntropy criterion (MCC). Specifically, the correntropy measure is used instead of the Frobenius-norm error to improve the robustness of motion averaging against outliers. With the half-quadratic technique, the correntropy-based optimization problem can be solved by an alternating minimization procedure that includes weight assignment and weighted motion averaging. Further, we design an adaptive kernel-width selection strategy to take full advantage of correntropy. Experimental results on benchmark datasets illustrate that our method has superior accuracy and robustness for multi-view registration and can also be applied to robot mapping. (A toy half-quadratic averaging sketch follows this entry.)

    DOI: 10.1109/ICRA48506.2021.9561406

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85125433598&origin=inward
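
    The two alternating operations named in the abstract, Gaussian-kernel weight assignment and weighted averaging, can be shown on a toy problem. Instead of relative motions on SE(3), the sketch robustly averages noisy translation vectors with outliers; the kernel width and the data are assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        true_t = np.array([1.0, -2.0, 0.5])
        samples = true_t + 0.05 * rng.normal(size=(40, 3))        # inlier measurements
        samples[:6] += rng.uniform(3, 6, size=(6, 3))             # gross outliers

        sigma = 0.3                                               # correntropy kernel width (assumed)
        mu = samples.mean(axis=0)                                 # initialize with the plain mean
        for _ in range(20):                                       # half-quadratic alternation
            r2 = np.sum((samples - mu) ** 2, axis=1)
            w = np.exp(-r2 / (2 * sigma ** 2))                    # weight assignment (Gaussian kernel)
            mu = (w[:, None] * samples).sum(axis=0) / w.sum()     # weighted averaging

        print("plain mean :", np.round(samples.mean(axis=0), 3))
        print("MCC average:", np.round(mu, 3))                    # close to [1.0, -2.0, 0.5]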

  • Partial feature selection and alignment for multi-source domain adaptation 査読有り

    Fu Y., Zhang M., Xu X., Cao Z., Ma C., Ji Y., Zuo K., Lu H.

    Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition   16649 - 16658   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Multi-Source Domain Adaptation (MSDA), which is dedicated to transferring the knowledge learned from multiple source domains to an unlabeled target domain, has drawn increasing attention in the research community. By assuming that the source and target domains share consistent key feature representations and an identical label space, existing studies on MSDA typically utilize the entire union set of features from both the source and target domains to obtain the feature map and align the map for each category and domain. However, the default setting of MSDA may neglect the issue of “partialness”, i.e., 1) a part of the features contained in the union set of multiple source domains may not be present in the target domain; 2) the label space of the target domain may not completely overlap with the multiple source domains. In this paper, we unify the above two cases into a more generalized MSDA task, Multi-Source Partial Domain Adaptation (MSPDA). We propose a novel model termed Partial Feature Selection and Alignment (PFSA) to jointly cope with both MSDA and MSPDA tasks. Specifically, we first employ a feature selection vector based on the correlation among the features of the multiple source and target domains. We then design three effective feature alignment losses to jointly align the selected features by preserving the domain information of the data sample clusters in the same category and the discrimination between different classes. Extensive experiments on various benchmark datasets for both MSDA and MSPDA tasks demonstrate that our proposed PFSA approach remarkably outperforms the state-of-the-art MSDA and unimodal PDA methods.

    DOI: 10.1109/CVPR46437.2021.01638

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85124186310&origin=inward

  • Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning 査読有り

    Zhang J., Xu X., Shen F., Lu H., Liu X., Shen H.T.

    35th AAAI Conference on Artificial Intelligence, AAAI 2021   4B   3351 - 3359   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    The recent success of audio-visual representation learning can be largely attributed to its pervasive concurrency property, which can be used as a self-supervision signal to extract correlation information. While most recent works focus on capturing the shared associations between the audio and visual modalities, they rarely consider multiple audio and video pairs at once and pay little attention to exploiting the valuable information within each modality. To tackle this problem, we propose a novel audio-visual representation learning method dubbed self-supervised curriculum learning (SSCL) in a teacher-student learning manner. Specifically, taking advantage of contrastive learning, a two-stage scheme is exploited, which transfers the cross-modal information between the teacher and student models as a phased process. The proposed SSCL approach regards the pervasive property of audio-visual concurrency as latent supervision and mutually distills the structural knowledge of the visual modality to the audio data. Notably, the SSCL method can learn discriminative audio and visual representations for various downstream applications. Extensive experiments conducted on both action video recognition and audio sound recognition tasks show the remarkably improved performance of the SSCL method compared with state-of-the-art self-supervised audio-visual representation learning methods.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85124551164&origin=inward

  • Action Recognition Framework in Traffic Scene for Autonomous Driving System 査読有り

    Xu F., Xu F., Xie J., Pun C.M., Lu H., Gao H.

    IEEE Transactions on Intelligent Transportation Systems   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    For an autonomous driving system, accurately recognizing the actions of different roles in the traffic scene is a prerequisite for realizing human-vehicle information interaction. In this paper, we propose a complete framework based on 3D human pose estimation to recognize the actions of different roles on the road. The main recognized objects include traffic police, cyclists, and some passersby in need. We perform action recognition with a dynamic adaptive graph convolutional network, which recognizes the actions of objects based on their 3D human pose. In addition to the action recognition module, we have optimized both the object detection module and the human pose estimation module in the framework so that the framework can handle multiple objects at the same time, bringing it closer to real traffic scenes. To realize complex and changeable human action recognition, we built a multi-view camera system to collect 3D human pose datasets containing traffic police gestures, cyclist gestures, and pedestrians' body movements. In the experiments, compared with other state-of-the-art studies, the proposed framework achieves comparable results on the same dataset. Satisfactory performance has also been obtained on the real data we collected, and the framework can handle a variety of different action recognition tasks at the same time.

    DOI: 10.1109/TITS.2021.3135251

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85122066803&origin=inward

  • Improved Point-Voxel Region Convolutional Neural Network: 3D Object Detectors for Autonomous Driving 査読有り

    Li Y., Yang S., Zheng Y., Lu H.

    IEEE Transactions on Intelligent Transportation Systems   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Recently, 3D object detection based on deep learning has achieved impressive performance in complex indoor and outdoor scenes. Among these methods, two-stage detectors perform best; however, their accuracy and efficiency still need improvement, especially for small objects and autonomous driving scenes. In this paper, we propose an improved 3D object detection method based on a two-stage detector called the Improved Point-Voxel Region Convolutional Neural Network (IPV-RCNN). Our proposed method includes online training for data augmentation, upsampling convolution, and k-means clustering of the bounding boxes to achieve 3D detection from raw point clouds. The evaluation results on the KITTI 3D dataset show that IPV-RCNN achieves 96% mAP, which is 3% more accurate than state-of-the-art detectors. (A k-means box-clustering sketch follows this entry.)

    DOI: 10.1109/TITS.2021.3071790

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85107209244&origin=inward
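
    The k-means step mentioned in the abstract is commonly used to derive anchor box sizes. The sketch below clusters synthetic (length, width, height) triples with plain NumPy; the cluster count and the data are assumptions, not the paper's configuration.

        import numpy as np

        rng = np.random.default_rng(0)
        cars = rng.normal([3.9, 1.6, 1.5], 0.20, size=(200, 3))   # synthetic (l, w, h) box sizes
        peds = rng.normal([0.8, 0.6, 1.7], 0.10, size=(100, 3))
        cyclists = rng.normal([1.8, 0.6, 1.7], 0.15, size=(100, 3))
        boxes = np.vstack([cars, peds, cyclists])

        k = 3
        centers = boxes[rng.choice(len(boxes), k, replace=False)]  # random initial anchors
        for _ in range(50):
            d = np.linalg.norm(boxes[:, None, :] - centers[None, :, :], axis=2)
            assign = d.argmin(axis=1)                              # nearest anchor for every box
            centers = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
                                for j in range(k)])                # keep old center if a cluster empties

        print("anchor sizes (l, w, h):")
        print(np.round(centers, 2))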

  • Image quality improvement using local adaptive neighborhood-based dark channel prior 査読有り

    Onoyama T., Lu H., Soomro A.A., Mokhtar A.A., Kamiya T., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   11884   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In-vehicle cameras and surveillance cameras are used in many situations in our daily lives. Visibility degradation in foggy environments is caused by the scattering of light reflected from real objects by minute water droplets or fog in the medium through which the light passes. The degree of degradation depends on the density of suspended microparticles between the observed object and the observation point. In general, the farther an object is from the camera, the more it is affected by the fog. The purpose of image de-fogging is to improve the clarity of an object by removing the effects of fog from the image. (A standard dark-channel-prior sketch follows this entry.)

    DOI: 10.1117/12.2603771

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85120496362&origin=inward
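
    For reference, the fixed-window dark channel prior that the entry above adapts with a locally adaptive neighborhood looks roughly as follows; the window size, omega, and t0 are the usual textbook choices, not the paper's values, and the input here is a random placeholder image.

        import numpy as np
        from scipy.ndimage import minimum_filter

        def dehaze(img, patch=15, omega=0.95, t0=0.1):
            """img: float RGB image in [0, 1], shape (H, W, 3)."""
            # dark channel: per-pixel channel minimum followed by a local minimum filter
            dark = minimum_filter(img.min(axis=2), size=patch)
            # atmospheric light: mean colour of the brightest 0.1% dark-channel pixels
            idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
            A = img.reshape(-1, 3)[idx].mean(axis=0)
            # transmission estimate, then scene radiance recovery
            t = 1.0 - omega * minimum_filter((img / A).min(axis=2), size=patch)
            t = np.clip(t, t0, 1.0)
            return np.clip((img - A) / t[..., None] + A, 0.0, 1.0)

        foggy = np.clip(np.random.default_rng(0).random((120, 160, 3)) * 0.4 + 0.5, 0.0, 1.0)
        print(dehaze(foggy).shape)                            # (120, 160, 3)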

  • Shape restoration by shadow information and photometric stereo 査読有り

    Kameda S., Lu H., Kamiya T., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   11884   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Virtual Reality (VR) systems have become popular in recent years, and capturing 3D objects from the real world has been studied. 3D objects involve large and complex data. In this paper, we propose a novel method that uses shadow information and photometric stereo from a single viewpoint to recover the 3D shapes of surfaces. The experimental results show that the proposed method achieves good accuracy. (A least-squares photometric-stereo sketch follows this entry.)

    DOI: 10.1117/12.2604193

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85120506573&origin=inward
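
    The photometric-stereo core that the paper combines with shadow cues is a per-pixel least-squares solve: with known light directions L and observed intensities I, the scaled normal is recovered via the pseudo-inverse. The sketch below runs that step on a synthetic Lambertian surface with no shadow handling; the lights and albedo are assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        # known unit light directions, one per captured image
        L = np.array([[0.0, 0.0, 1.0],
                      [0.5, 0.0, 0.87],
                      [0.0, 0.5, 0.87],
                      [-0.4, -0.3, 0.87]])
        L /= np.linalg.norm(L, axis=1, keepdims=True)

        # synthetic Lambertian pixels: random camera-facing normals with random albedo
        n_pix = 1000
        normals = rng.normal(size=(n_pix, 3))
        normals[:, 2] = np.abs(normals[:, 2])
        normals /= np.linalg.norm(normals, axis=1, keepdims=True)
        albedo = rng.uniform(0.3, 1.0, size=(n_pix, 1))
        I = (albedo * normals) @ L.T                          # intensities, no shadows in this toy

        # recover scaled normals g = pinv(L) @ I per pixel, then split albedo and direction
        g = I @ np.linalg.pinv(L).T
        rho = np.linalg.norm(g, axis=1, keepdims=True)
        n_est = g / np.clip(rho, 1e-8, None)

        err = np.degrees(np.arccos(np.clip((n_est * normals).sum(axis=1), -1.0, 1.0)))
        print("mean normal error (deg): %.2e" % err.mean())   # ~0 on this noise-free toy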

  • Underwater image super-resolution using SRCNN 査読有り

    Ooyama S., Lu H., Kamiya T., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   11884   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, energy and mineral resources have become more important due to rapid industrialization worldwide. This global industrialization has caused a shortage of mineral resources and increased reliance on alternative energy sources, so the exploration of the abundant resources in the ocean is being promoted. However, it is dangerous and impractical for humans to dive and search for marine resources by hand, so underwater exploration can proceed safely by having robots do the work instead. Robots have become the mainstream search tool in the underwater environment because of its various hazardous conditions. One of the problems associated with robot control in underwater environments is poor visibility in the water. To improve visibility, we attempt to increase the resolution of underwater images using super-resolution technology. In this paper, we conduct experiments using SRCNN, a basic super-resolution technique, on underwater images. In addition, we investigate the effectiveness on SRCNN of "Mish", an activation function that has attracted attention in recent years for its potential to surpass "ReLU", the typical activation function of neural networks.

    DOI: 10.1117/12.2603761

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85120500372&origin=inward
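
    For reference, a compact sketch of the three-layer SRCNN with a switchable ReLU/Mish activation, as discussed in the entry above. It assumes PyTorch 1.9+ (where nn.Mish is available); the 9-1-5 kernel sizes follow the original SRCNN paper and are not necessarily the settings used in this work.

        import torch
        import torch.nn as nn

        class SRCNN(nn.Module):
            def __init__(self, channels=3, activation="mish"):
                super().__init__()
                Act = nn.Mish if activation == "mish" else nn.ReLU
                self.body = nn.Sequential(
                    nn.Conv2d(channels, 64, kernel_size=9, padding=4), Act(),   # patch extraction
                    nn.Conv2d(64, 32, kernel_size=1),                  Act(),   # non-linear mapping
                    nn.Conv2d(32, channels, kernel_size=5, padding=2),          # reconstruction
                )

            def forward(self, x):
                # x: a bicubically upscaled low-resolution image, shape (B, C, H, W)
                return self.body(x)

        lr_up = torch.rand(1, 3, 64, 64)                   # toy pre-upscaled input
        print(SRCNN(activation="mish")(lr_up).shape)       # -> torch.Size([1, 3, 64, 64])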

  • A novel single image reflection removal method 査読有り

    Ishiyama S., Lu H., Soomro A.A., Mokhtar A.A.

    Proceedings of SPIE - The International Society for Optical Engineering   11884   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Reflection is a kind of image noise that is frequently generated by windows, glass, and similar surfaces when taking pictures or videos. Reflections not only degrade image quality but also affect computer vision tasks such as object detection and segmentation. In single image reflection removal (SIRR), learning-based models are often used because reflections appear in many patterns and a versatile model is required. Conventional deep-learning-based SIRR faces two problems: the assumed reflection scenes vary widely, and there is little training data because ground truth is difficult to obtain. In this study, we focus on the latter and propose an SIRR method based on meta-learning, adopting MAML, one of the representative meta-learning methods, together with a deep learning model; the Iterative Boost Convolutional LSTM Network (IBCLN) is adopted as the underlying network. The proposed method improves accuracy compared with a conventional state-of-the-art SIRR method.

    DOI: 10.1117/12.2604356

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85120479363&origin=inward

  • scSE-CRNNと3種類の呼吸音変換画像による呼吸音の分類 査読有り

    浅谷 尚希, 陸 慧敏, 神谷 亨, 間普 真吾, 木戸 尚治

    医用画像情報学会雑誌 ( 医用画像情報学会 )   38 ( 4 )   152 - 159   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Respiratory diseases such as chronic obstructive pulmonary disease and lower respiratory tract infections cause nearly 8 million deaths worldwide each year, and reducing this number is a challenge to be solved worldwide. Early detection is the most effective way to reduce deaths from respiratory illness: the spread of infection can be suppressed and the therapeutic effect enhanced. Currently, auscultation is performed as a promising method for early detection of respiratory diseases, since it can suggest respiratory diseases by distinguishing abnormal sounds contained in respiratory sounds. However, medical staff need training to perform auscultation with high accuracy, and the diagnostic results depend on each listener's subjective judgement, which can lead to inconsistent results. In some environments, a shortage of specialized health care workers can therefore allow respiratory illness to spread. To solve this problem, an application that analyzes respiratory sounds and outputs diagnostic results is needed. In this paper, we use a newly proposed deep learning model to automatically classify the respiratory sound data from the ICBHI 2017 Challenge Dataset. The Short-Time Fourier Transform, Constant-Q Transform, and Continuous Wavelet Transform are applied to the respiratory sound data to convert it into the time-frequency domain. The three resulting breath-sound images are then input to a CRNN (Convolutional Recurrent Neural Network) with scSE (Spatial and Channel Squeeze & Excitation) blocks, and accuracy is improved by weighting the features of each image. As a result, AUC (Area Under the Curve) values of 0.87 (Normal), 0.88 (Crackle), 0.92 (Wheeze), and 0.89 (Both), Sensitivity: 0.67, Specificity: 0.82, Average Score: 0.75, Harmonic Score: 0.74, and Accuracy: 0.75 were obtained.

    DOI: 10.11318/mii.38.152

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130008136055
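
    A hedged sketch of a spatial-and-channel squeeze-and-excitation (scSE) block of the kind named in the title above, assuming PyTorch; the exact block wiring inside the paper's scSE-CRNN may differ, and the channel count, reduction ratio, and input shape below are placeholders.

        import torch
        import torch.nn as nn

        class SCSEBlock(nn.Module):
            def __init__(self, channels, reduction=16):
                super().__init__()
                # channel squeeze & excitation (cSE): global pool -> bottleneck -> sigmoid gate
                self.cse = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
                    nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
                )
                # spatial squeeze & excitation (sSE): 1x1 conv -> sigmoid spatial map
                self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

            def forward(self, x):
                # sum of the channel-gated and spatially-gated feature maps
                return x * self.cse(x) + x * self.sse(x)

        feat = torch.rand(2, 32, 40, 60)       # e.g. a spectrogram feature map
        print(SCSEBlock(32)(feat).shape)       # -> torch.Size([2, 32, 40, 60])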

  • GRAPH CONVOLUTIONAL HOURGLASS NETWORKS FOR SKELETON-BASED ACTION RECOGNITION 査読有り

    Zhu Y., Xu X., Ji Y., Shen F., Shen H.T., Lu H.

    Proceedings - IEEE International Conference on Multimedia and Expo   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Graph convolution networks (GCNs) have become the mainstream framework for the skeleton-based action recognition task, since the skeleton representation of human action can be naturally modeled by the graph structure. Generally, most of the existing GCN based models extract and aggregate skeleton features by exploiting single-scale joint information, while neglecting the valuable multi-scale information such as part and body features in the skeleton. To address this issue, we propose a novel Graph Convolutional Hourglass Network (GCHN) model, which is scalable by stacking several basic modules of Graph Convolutional Hourglass Block (GCHB). Each GCHB module consists of the sequential operations of graph convolution, graph pooling and graph unpooling, which can promote the interaction of multi-scale information in the skeleton and effectively improve the recognition performance. Extensive experiments on the challenging NTU-RGB+D and Kinetics-Skeleton datasets demonstrate that the proposed GCHN model achieves state-of-the-art performance.

    DOI: 10.1109/ICME51207.2021.9428355

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85126488341&origin=inward

  • LIGHTWEIGHT IMAGE SUPER-RESOLUTION WITH MULTI-SCALE FEATURE INTERACTION NETWORK 査読有り

    Wang Z., Gao G., Li J., Yu Y., Lu H.

    Proceedings - IEEE International Conference on Multimedia and Expo   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, the single image super-resolution (SISR) approaches with deep and complex convolutional neural network structures have achieved promising performance. However, those methods improve the performance at the cost of higher memory consumption, which is difficult to be applied for some mobile devices with limited storage and computing resources. To solve this problem, we present a lightweight multi-scale feature interaction network (MSFIN). For lightweight SISR, MSFIN expands the receptive field and adequately exploits the informative features of the low-resolution observed images from various scales and interactive connections. In addition, we design a lightweight recurrent residual channel attention block (RRCAB) so that the network can benefit from the channel attention mechanism while being sufficiently lightweight. Extensive experiments on some benchmarks have confirmed that our proposed MSFIN can achieve comparable performance against the state-of-the-arts with a more lightweight model.

    DOI: 10.1109/ICME51207.2021.9428136

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85126441097&origin=inward

  • MULTIMODAL TRANSFORMER NETWORKS WITH LATENT INTERACTION FOR AUDIO-VISUAL EVENT LOCALIZATION 査読有り

    He Y., Xu X., Liu X., Ou W., Lu H.

    Proceedings - IEEE International Conference on Multimedia and Expo   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    The task of audio-visual event localization (AVEL) aims to localize a visible and audible event in a video. Previous methods first divide a video into segments and then fuse visual and acoustic features at the segment level via a co-attention mechanism. However, existing methods mostly model relations between individual visual and audio segments over a limited, short period, which may not cover the longer video durations needed for better high-level event information modeling. In this paper, we propose a novel model termed Multimodal Transformer Network with Latent Interaction (MTNLI) to tackle this problem. The proposed MTNLI model employs a multimodal Transformer structure to learn the cross-modality relationships between latent visual and audio summarizations in long segment sequences; the visual and audio segments are summarized into a small number of latent representations to avoid modeling uninformative individual visual-audio relations. The cross-modality information between the latent summarizations is propagated to fuse valuable information from both modalities, which can effectively handle large temporal inconsistencies between vision and audio. Our MTNLI method achieves state-of-the-art performance on the benchmark AVE (Audio-Visual Event) dataset for the event localization task.

    DOI: 10.1109/ICME51207.2021.9428081

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85119152207&origin=inward

  • Sentence pair modeling based on semantic feature map for human interaction with IoT devices 査読有り

    Yu R., Lu W., Lu H., Wang S., Li F., Zhang X., Yu J.

    International Journal of Machine Learning and Cybernetics   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    The rapid development of the Internet of Things (IoT) brings an urgent requirement for intelligent human-device interaction using natural language, which is critical for facilitating people's use of IoT devices. Efficient interactive approaches depend on various natural language understanding technologies; among them, sentence pair modeling (SPM) is essential, and neural networks have achieved great success in SPM thanks to their powerful abilities in feature extraction and representation. However, since sentences are one-dimensional (1D) texts, the available neural networks are usually limited to 1D sequential models, which limits the performance of the SPM task. To address this gap, in this paper we propose a novel neural architecture for sentence pair modeling that uses 1D sentences to construct multi-dimensional feature maps, similar to images containing multiple color channels. Based on these feature maps, more kinds of neural models become applicable to the SPM task, including 2D CNNs. In the proposed model, the sentence at a specific granularity is first encoded with a BiLSTM to generate the representation at that granularity, which is viewed as a special channel of the sentence; the representations from different granularities are merged to construct the semantic feature map of the input sentence. A 2D CNN is then employed to encode the feature map and capture the deeper semantic features contained in the sentence. Next, another 2D CNN captures the interactive matching features between sentences, followed by 2D max-pooling and an attention mechanism to generate the final matching representation. Finally, the matching degree of the sentences is judged with a sigmoid function according to the matching representation. Extensive experiments are conducted on two real-world data sets. In comparison with benchmarks, the proposed model achieved remarkable results and performed better than or comparably to BERT-based models. Our work is beneficial to building a more powerful humanized interaction system with IoT devices.

    DOI: 10.1007/s13042-021-01349-x

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85107850869&origin=inward

  • Deep-learning based segmentation algorithm for defect detection in magnetic particle testing images 査読有り

    Ueda A., Lu H., Kamiya T.

    Proceedings of International Conference on Artificial Life and Robotics   2021   235 - 238   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Magnetic Particle Testing (MPT), also referred to as magnetic particle inspection, is a nondestructive examination (NDE) technique used to detect surface and slightly subsurface flaws in most ferromagnetic materials such as iron, nickel, and cobalt, and some of their alloys. The procedure is laborious and is often carried out in a harsh environment, so automation of MPT is strongly desired; however, finding defects in the formed magnetic powder pattern requires a highly skilled inspector, and automation has been considered difficult. In recent years, many defect detection methods based on deep learning have been proposed, and deep learning has proven effective at automatically detecting various types of defects with different shapes and sizes. In this paper, we describe the development of a deep-learning-based segmentation algorithm for defect detection in MPT images. We achieved an F2 score of 84.04% by using U-Net as the segmentation model together with a strong backbone network and an optimal loss function.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85108854991&origin=inward
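
    Since the entry above reports an F2 score, here is a small pixel-wise F-beta sketch (beta = 2 weights recall over precision) for binary segmentation masks, using NumPy; the thresholding and averaging conventions may differ from the paper's.

        import numpy as np

        def f_beta(pred, target, beta=2.0, eps=1e-7):
            """Pixel-wise F-beta score between two binary masks (beta=2 gives F2)."""
            pred, target = pred.astype(bool), target.astype(bool)
            tp = np.logical_and(pred, target).sum()
            fp = np.logical_and(pred, ~target).sum()
            fn = np.logical_and(~pred, target).sum()
            precision = tp / (tp + fp + eps)
            recall = tp / (tp + fn + eps)
            return (1 + beta**2) * precision * recall / (beta**2 * precision + recall + eps)

        pred = np.zeros((64, 64), dtype=np.uint8); pred[10:30, 10:30] = 1   # toy prediction
        gt   = np.zeros((64, 64), dtype=np.uint8); gt[12:32, 12:32] = 1     # toy ground truth
        print(round(float(f_beta(pred, gt)), 3))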

  • Discrete-Time Predictive Sliding Mode Control for a Constrained Parallel Micropositioning Piezostage 査読有り

    Kang S., Wu H., Yang X., Li Y., Yao J., Chen B., Lu H.

    IEEE Transactions on Systems, Man, and Cybernetics: Systems   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    This article proposes a new discrete-time predictive sliding mode control (DPSMC) for a parallel micropositioning piezostage to improve motion accuracy in the presence of cross-coupling hysteresis nonlinearities and input constraints. Unlike traditional linear discrete-time sliding mode control (DSMC), the proposed DPSMC is chattering-free and has a faster convergence rate thanks to the design of a nonlinear discrete-time fast integral terminal sliding mode surface. Moreover, by combining it with receding horizon optimization, the sliding mode state is predicted to follow the expected trajectory of a predefined continuous sliding mode reaching law, which also allows the proposed controller to explicitly deal with constraints. The stability of the closed-loop system is analyzed under model disturbances and constraints, and it is proved that the proposed DPSMC offers a smaller quasi-sliding mode bandwidth than traditional DSMC. The effectiveness of the proposed controller is validated by a series of numerical simulations and experiments, and the results demonstrate the advantages of the proposed DPSMC over the traditional DSMC method.

    DOI: 10.1109/TSMC.2021.3062581

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85102644075&origin=inward

  • Generalized Label Enhancement with Sample Correlations 査読有り

    Zheng Q., Zhu J., Tang H., Liu X., Li Z., Lu H.

    IEEE Transactions on Knowledge and Data Engineering   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Recently, label distribution learning (LDL) has drawn much attention in machine learning, where an LDL model is learned from labeled instances. Different from single-label and multi-label annotations, label distributions describe an instance by multiple labels with different intensities and accommodate more general scenes. Since most existing machine learning datasets merely provide logical labels, label distributions are unavailable in many real-world applications. To handle this problem, we propose two novel label enhancement methods, i.e., Label Enhancement with Sample Correlations (LESC) and generalized Label Enhancement with Sample Correlations (gLESC). More specifically, LESC employs a low-rank representation of samples in the feature space, and gLESC leverages tensor multi-rank minimization to further investigate the sample correlations in both the feature space and the label space. Benefiting from the sample correlations, the proposed methods can boost the performance of label enhancement. Extensive experiments on 14 benchmark datasets demonstrate the effectiveness and superiority of our methods.

    DOI: 10.1109/TKDE.2021.3073157

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85104234576&origin=inward

  • HPSS を用いた呼吸音の自動分類 査読有り

    丸橋 優生, 浅谷 尚希, 陸 慧敏, 神谷 亨, 間普 真吾, 木戸 尚治

    医用画像情報学会雑誌 ( 医用画像情報学会 )   38 ( 2 )   95 - 100   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Respiratory disease is a serious illness that accounts for three of the top ten causes of death in the world, and approximately eight million people die from it worldwide each year. Early detection and early treatment are important for preventing the progression of these diseases. Currently, auscultation is performed for the diagnosis of respiratory diseases; however, quantitative diagnosis is difficult. Therefore, in this paper, we propose a new automatic classification method for respiratory sounds to support the diagnosis of respiratory diseases by auscultation. In the proposed method, respiratory sound data are converted into a spectrogram image by applying the short-time Fourier transform. We then apply the HPSS (Harmonic/Percussive Sound Separation) algorithm to the respiratory sound spectrogram to separate it into a harmonic spectrogram and a percussive spectrogram. The three generated spectrograms are used for classification of respiratory sounds by CNN (Convolutional Neural Network) and SVM (Support Vector Machine) classifiers. Our proposed method obtained superior classification performance compared to the case without HPSS, and satisfactory results were obtained.

    DOI: 10.11318/mii.38.95

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130008062643
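
    A rough sketch of the preprocessing pipeline described above, assuming the librosa library: an STFT spectrogram plus harmonic/percussive source separation (HPSS), yielding three spectrogram images per recording. The file name, sampling rate, and FFT parameters are placeholders, not values from the paper.

        import numpy as np
        import librosa

        def three_spectrograms(wav_path, sr=4000, n_fft=256, hop=64):
            """Return dB-scaled full, harmonic, and percussive spectrograms of one recording."""
            y, _ = librosa.load(wav_path, sr=sr)
            S = librosa.stft(y, n_fft=n_fft, hop_length=hop)   # complex STFT
            H, P = librosa.decompose.hpss(S)                   # harmonic / percussive parts
            to_db = lambda X: librosa.amplitude_to_db(np.abs(X), ref=np.max)
            return to_db(S), to_db(H), to_db(P)                # three inputs for CNN/SVM

        # spec, harm, perc = three_spectrograms("breath_sound.wav")   # hypothetical file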

  • Image-Scale-Symmetric Cooperative Network for Defocus Blur Detection 査読有り

    Zhao F., Lu H., Zhao W., Yao L.

    IEEE Transactions on Circuits and Systems for Video Technology   2021年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Defocus blur detection (DBD) for natural images is a challenging vision task, especially in the presence of homogeneous regions and gradual boundaries. In this paper, we propose a novel image-scale-symmetric cooperative network (IS2CNet) for DBD. On one hand, as the image scale goes from large to small, IS2CNet gradually widens its view of the image content, so the homogeneous-region detection map can be optimized gradually. On the other hand, as the image scale goes from small to large, IS2CNet gradually perceives high-resolution image content, thereby gradually refining transition-region detection. In addition, we propose a hierarchical feature integration and bi-directional delivering mechanism that transfers the hierarchical features of the previous image-scale network to the input and tail of the current image-scale network, guiding it to better learn the residual. The proposed approach achieves state-of-the-art performance on existing datasets. Codes and results are available at: https://github.com/wdzhao123/IS2CNet.

    DOI: 10.1109/TCSVT.2021.3095347

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85110906905&origin=inward

  • Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency 査読有り

    Zhou Y., Xu X., Shen F., Gao L., Lu H., Shen H.T.

    MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia   475 - 483   2020年10月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, developing temporally consistent video-based processing techniques has drawn increasing attention because existing image-based processing algorithms (e.g., filtering, enhancement, colorization, etc.) do not extend well to video. Applying these image-based algorithms independently to each video frame typically leads to temporal flickering due to the global instability of the algorithms. In this paper, we treat enforcing temporal consistency in a video as a temporal denoising problem: removing the flickering effect from given unstable pre-processed frames. Specifically, we propose a novel model termed Temporal Denoising Mask Synthesis Network (TDMS-Net) that jointly predicts the motion mask, soft optical flow, and refining mask to synthesize temporally consistent frames. The temporal consistency is learned from the original video, and the learned temporal features are applied to reprocess the output frames in a way that is agnostic (blind) to the specific image-based processing algorithm. Experimental results on two datasets for 16 different applications demonstrate that the proposed TDMS-Net significantly outperforms two state-of-the-art blind temporal consistency approaches.

    DOI: 10.1145/3394171.3413788

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85106738511&origin=inward

  • Marine Organisms Tracking and Recognizing Using YOLO 査読有り

    Uemura T., Lu H., Kim H.

    EAI/Springer Innovations in Communication and Computing   53 - 58   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    A system that automatically surveys the deep sea has never been developed, and the purpose of this study is to develop such a system. We employ a technique for recognizing and tracking multiple objects called "You Only Look Once" (YOLO), which provides a very fast and accurate tracker. In our system, we first remove the haze caused by the turbidity of the water from each image, and then apply YOLO to track and recognize marine organisms, including shrimp, squid, crab, and shark. The developed system shows generally satisfactory performance.

    DOI: 10.1007/978-3-030-17763-8_6

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85097568303&origin=inward

  • Multi-Level Chaotic Maps for 3D Textured Model Encryption 査読有り

    Jin X., Zhu S., Wu L., Zhao G., Li X., Zhou Q., Lu H.

    EAI/Springer Innovations in Communication and Computing   107 - 117   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    With the rapid progress of virtual reality and augmented reality technologies, 3D content is becoming the next widespread medium in many applications, so the protection of 3D models is of primary importance and encryption is essential to maintain confidentiality. Previous work on encryption of 3D surface models often considers the point cloud, the mesh, and the textures individually. In this work, a multi-level chaotic-map model for 3D textured model encryption is presented, based on the observation that vertices (the point cloud), polygons, and textures contribute differently to recognizing a cipher 3D model. For the vertices, which contribute most to recognition, we use a high-level 3D Lu chaotic map; for the polygons and textures, which contribute relatively less, we use the 2D Arnold's cat map and the 1D logistic map, respectively. The experimental results show that our method attains performance similar to a method that applies the same high-level chaotic map to the point cloud, polygons, and textures, while requiring less time. Moreover, our method can resist more kinds of attacks, such as statistical attack, brute-force attack, and correlation attack.

    DOI: 10.1007/978-3-030-17763-8_10

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112803527&origin=inward
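
    As a toy illustration of one ingredient named above, the 2D Arnold's cat map permutation that scrambles pixel positions of a square image is sketched below in NumPy; the 3D Lu map and the 1D logistic map used for the other components are not reproduced here.

        import numpy as np

        def arnold_cat(img, iterations=1):
            """Scramble a square array (N, N[, C]) with the Arnold's cat map permutation."""
            n = img.shape[0]
            xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
            out = img.copy()
            for _ in range(iterations):
                scrambled = np.empty_like(out)
                scrambled[(xs + ys) % n, (xs + 2 * ys) % n] = out   # (x, y) -> (x+y, x+2y) mod n
                out = scrambled
            return out

        img = np.arange(16).reshape(4, 4)      # toy 4x4 "image"
        print(arnold_cat(img, iterations=2))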

  • Research into the Adaptability Evaluation of the Remote Sensing Image Fusion Method Based on Nearest-Neighbor Diffusion Pan Sharpening 査読有り

    Wang C., Shao W., Lu H., Zhang H., Wang S., Yue H.

    EAI/Springer Innovations in Communication and Computing   33 - 39   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Nearest-neighbor diffusion pan sharpening, a new image fusion method based on nearest-neighbor diffusion, has become a new research hot spot. In this paper, the nearest-neighbor diffusion pan sharpening method is used in a WorldView-2 image fusion experiment and compared with commonly used methods such as the wavelet transform, PCA transform, and Gram–Schmidt transform fusion methods. The experimental results show that it preserves spatial information, spatial details, and texture better than the other three methods.

    DOI: 10.1007/978-3-030-17763-8_4

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85134288923&origin=inward

  • Weighted Linear Multiple Kernel Learning for Saliency Detection 査読有り

    Zhou Q., Wu J., Fan Y., Zhang S., Wu X., Zheng B., Jin X., Lu H., Latecki L.J.

    EAI/Springer Innovations in Communication and Computing   201 - 213   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    This paper presents a novel saliency detection method based on weighted linear multiple kernel learning (WLMKL), which is able to adaptively combine different contrast measurements in a supervised manner. Three commonly used bottom-up visual saliency operations are first introduced, including corner-surround contrast (CSC), center-surround contrast (CESC), and global contrast (GC). These contrast measures are then fed into our WLMKL framework to produce the final saliency map. We show that the weights assigned to the contrast feature maps are always normalized in our WLMKL formulation. In addition, the proposed approach benefits from the contribution of each individual contrast operation and thus produces more robust and accurate saliency maps. Extensive experimental results show the effectiveness of the proposed model and demonstrate that the combination is superior to each individual subcomponent.

    DOI: 10.1007/978-3-030-17763-8_19

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85113232085&origin=inward

  • YOLOv3 を用いた全天球カメラ映像からの障害物認識 査読有り

    甲斐 友博, 陸 慧敏, 神谷 亨

    バイオメディカル・ファジィ・システム学会大会講演論文集 ( バイオメディカル・ファジィ・システム学会 )   33 ( 0 )   84 - 88   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Attention has been focused on assistive devices in Japan's aging era. One such device is the electric wheelchair, which enables people with physical disabilities to move easily. However, accidents occur frequently as demand for electric wheelchairs increases, so the development of an autonomously driving electric wheelchair is required to reduce accidents. In this paper, we propose obstacle recognition for panoramic images obtained from a spherical camera. A spherical camera is mounted on an electric wheelchair, and images are cut out from the sequential images obtained while driving. For obstacle recognition we use YOLOv3, and the proposed method takes into account the image distortion caused by the spherical camera. An improvement of the YOLOv3 model is examined, and its validity is verified with real data.

    DOI: 10.24466/pacbfsa.33.0_84

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007980050

  • Automatic extraction of abnormalities on temporal ct subtraction images using sparse coding and 3d-cnn 査読有り

    Koizumi Y., Miyake N., Lu H., Kim H., Aoki T., Kido S.

    Proceedings of International Conference on Artificial Life and Robotics   2020   783 - 786   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, the death rate from cancer has tended to increase in Japan, and there is concern that the increasing performance of CT will increase the burden on doctors. By presenting a "second opinion", a CAD system can reduce this burden. We are developing a CAD system for automatic detection of lung cancer. In this paper we propose a method to detect abnormalities based on the temporal subtraction technique, sparse coding, and a 3D-CNN. We find that the sparsity level contributes most to the score.

    DOI: 10.5954/ICAROB.2020.GS3-3

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85108848194&origin=inward

  • Image Segmentation with Language Referring Expression and Comprehension 査読有り

    Sun J., Li Y., Cai J., Lu H., Serikawa S.

    IEEE Sensors Journal   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Image segmentation with a language referring expression completes object segmentation based on the expression text. Existing image segmentation methods show good results on high-performance computers, but most robot systems need real-time performance and high accuracy, and at present most methods cannot meet these requirements well. Therefore, we propose a high-precision, real-time deep learning network that integrates the two tasks of image segmentation with language referring expression and referring expression comprehension, treating them as two branches. Specifically, the proposed network first merges the two tasks, and the feature maps of different scales extracted by each branch are fed back to the two branches to obtain prediction results, so that the two tasks promote and constrain each other. Experiments show that our method has better real-time performance and higher accuracy than existing methods.

    DOI: 10.1109/JSEN.2020.3041046

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85097962167&origin=inward

  • 時間-周波数解析と畳み込みニューラルネットワークを用いた呼吸音の自動分類 査読有り

    南 弘毅, 陸 慧敏, 金 亨燮, 平野 靖, 間普 真吾, 木戸 尚治

    Medical Imaging Technology ( 日本医用画像工学会 )   38 ( 1 )   40 - 47   2020年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Auscultation of respiratory sounds with a stethoscope has long been used to diagnose respiratory diseases. While it is a simple and safe diagnostic method, there is no quantitative evaluation standard for auscultated sounds, so a system that supports physicians' diagnosis is needed. In this paper, we propose an automatic classification method for respiratory sounds using a convolutional neural network (CNN). The main flow of the method is as follows: the short-time Fourier transform and the continuous wavelet transform are applied to the respiratory sound data to generate spectrogram and scalogram images, and a CNN then classifies normal respiratory sounds, continuous rales, and discontinuous rales from the generated images. Applying the proposed method to respiratory sound data of 22 cases yielded an overall accuracy of 79.44% and an AUC (area under the curve) of 0.942 based on the ROC (receiver operating characteristic) curve.

    DOI: 10.11409/mit.38.40

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007796400

  • A 6-DOF telexistence drone controlled by a head mounted display 査読有り

    Xia X., Pun C.M., Zhang D., Yang Y., Lu H., Gao H., Xu F.

    26th IEEE Conference on Virtual Reality and 3D User Interfaces, VR 2019 - Proceedings   1241 - 1242   2019年03月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, a new form of telexistence is achieved by recording images with cameras on an unmanned aerial vehicle (UAV) and displaying them to the user via a head mounted display (HMD). A key problem here is how to provide a free and natural mechanism for the user to control the viewpoint and watch a scene. To this end, we propose an improved rate-control method with an adaptive origin update (AOU) scheme. Without the aid of any auxiliary equipment, our scheme handles the self-centering problem. In addition, we present a full 6-DOF viewpoint control method to manipulate the motion of a stereo camera, and we build a real prototype to realize this by utilizing a pan-tilt-zoom (PTZ) which not only provides 2-DOF to the camera but also compensates the jittering motion of the UAV to record more stable image streams.

    DOI: 10.1109/VR.2019.8797791

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85071888370&origin=inward

  • Introduction to the special section on Artificial Intelligence and Computer Vision 査読有り

    Lu H., Guna J.

    Computers and Electrical Engineering   73   378 - 379   2019年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    DOI: 10.1016/j.compeleceng.2018.12.003

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060661799&origin=inward

  • 3D-CNN による経時的差分像上の結節状陰影検出 査読有り

    芳野 由利子, 陸 慧敏, 金 亨燮, 村上 誠一, 青木 隆敏, 木戸 尚治

    医用画像情報学会雑誌 ( 医用画像情報学会 )   36 ( 2 )   77 - 82   2019年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    A temporal subtraction image is obtained by subtracting a previous image, warped so that its structures match those of the current image, from the current image. The temporal subtraction technique removes normal structures and enhances interval changes such as new lesions and changes of existing abnormalities in a medical image. However, many artifacts remain on a temporal subtraction image, and these can be detected as false positives. In this paper, we propose a 3D-CNN classification step applied after initial nodule candidates are detected with the temporal subtraction technique. We compared 7 model architectures, namely 3D ShallowNet, 3D-AlexNet, 3D-VGG11, 3D-VGG13, 3D-ResNet8, 3D-ResNet20, and 3D-ResNet32, on 28 thoracic MDCT cases including 28 small-sized lung nodules. The highest performance was obtained with 3D-AlexNet.

    DOI: 10.11318/mii.36.77

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007668727

  • CNNを用いた指骨CR画像からの骨粗しょう症の自動識別 査読有り

    畠野 和裕, 村上 誠一, 植村 知規, 陸 慧敏, 金 亨燮, 青木 隆敏

    Medical Imaging Technology ( 日本医用画像工学会 )   37 ( 2 )   107 - 115   2019年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Osteoporosis is one of the main diseases of bone. Image diagnosis is effective for osteoporosis, but there are concerns about the increased burden on physicians, variation in diagnostic results due to differences in experience, and undetected lesions. In this paper, we propose a diagnosis support method that classifies osteoporosis from phalangeal computed radiography (CR) images and presents the results to physicians. In the proposed method, a classifier based on the Residual Network (ResNet), a type of convolutional neural network, is constructed to classify the presence or absence of osteoporosis. Images generated from the CR images are used as input to the ResNet; we propose three kinds of input images, and training and classification are evaluated for each. In experiments, the proposed method was applied to 101 cases and evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, and a maximum AUC of 0.931 was obtained.

    DOI: 10.11409/mit.37.107

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007633096

  • 畳み込みニューラルネットワークを用いた指骨CR画像からの骨粗しょう症の識別 査読有り

    畠野 和裕, 村上 誠一, 植村 知規, 陸 慧敏, 金 亨燮, 青木 隆敏

    医用画像情報学会雑誌 ( 医用画像情報学会 )   36 ( 2 )   72 - 76   2019年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Osteoporosis is known as one of the main diseases of bone. Although image diagnosis for osteoporosis is effective, there are concerns about the increased burden on radiologists associated with diagnostic imaging, uneven diagnostic results due to differences in experience, and undetected lesions. Therefore, in this study, we propose a diagnosis support method for classifying osteoporosis from phalangeal computed radiography images and presenting the classification results to physicians. In the proposed method, we construct classifiers using a convolutional neural network and classify normal and abnormal cases of osteoporosis. In our experiments, two kinds of CNN models were constructed using input images generated from 101 cases of CR images and evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Finally, an AUC of 0.995 was obtained.

    DOI: 10.11318/mii.36.72

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007668726

  • Swallowing motion analyzing from dental MR imaging based on AKAZE and particle filter algorithm 査読有り

    Suetani K., Lu H., Tan J., Kim H., Tanaka T., Kitou S., Morimoto Y.

    International Conference on Control, Automation and Systems   2018-October   1343 - 1346   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, dysphagia has become a problem among elderly people, so it is necessary to accurately evaluate swallowing function to prevent swallowing disorders beforehand or to detect them early, and evaluation of swallowing function using Magnetic Resonance Imaging (MRI) is considered useful. To accurately analyze swallowing motion with a computer-aided diagnosis (CAD) system on MR imaging, automatic extraction of the esophagus region, the region of interest for the image analysis, is required, and extraction of the spinal region is required as a preliminary step. Therefore, in this paper, we develop an analysis method for swallowing movement in three steps: extraction of the spinal region, extraction of the esophageal region, and analysis of the swallowing movement. For the motion analysis, we emphasize the liquid part during swallowing using an emphasis map, then track the liquid using AKAZE features and a particle filter algorithm, and analyze the swallowing motion.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060441189&origin=inward
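
    A generic OpenCV sketch of AKAZE feature extraction and matching between two consecutive frames, the kind of building block the tracking step above relies on; the emphasis map and particle filter of the paper are not reproduced, and the frame file names are placeholders.

        import cv2

        def match_akaze(frame_prev, frame_curr, max_matches=50):
            """Return matched point pairs ((x1, y1), (x2, y2)) between two grayscale frames."""
            akaze = cv2.AKAZE_create()
            kp1, des1 = akaze.detectAndCompute(frame_prev, None)
            kp2, des2 = akaze.detectAndCompute(frame_curr, None)
            if des1 is None or des2 is None:
                return []
            # AKAZE descriptors are binary, so Hamming distance is appropriate
            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
            matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
            return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches[:max_matches]]

        # pairs = match_akaze(cv2.imread("frame_000.png", 0), cv2.imread("frame_001.png", 0))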

  • ROI-based fully automated liver registration in multi-phase CT images 査読有り

    Saito K., Lu H., Kim H., Kido S., Tanabe M.

    International Conference on Control, Automation and Systems   2018-October   645 - 649   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In this paper, we propose a registration method for fully automated liver tumor detection. Multi-phase CT is used for the detection of liver tumors because it provides different characteristic features of lesions at each time phase. Registration accuracy is important when obtaining image features from multiple time phases; however, because each time phase has different image density characteristics, registration of multi-phase CT is a challenging task. We propose a robust initial alignment method that is independent of the changing image density in each time phase, and a deformable registration method that uses the liver region extracted by U-Net as the region of interest (ROI). Our proposed method is evaluated on 15 patient image sets, registering the early arterial phase to the equilibrium phase. Experimental results show 83% accuracy for segmentation of the early arterial phase and 93% accuracy for registration.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060464031&origin=inward

  • Extraction of median plane from facial 3D point cloud based on symmetry analysis using ICP algorithm 査読有り

    Yamada S., Lu H., Tan J., Kim H., Kimura N., Okawachi T., Nozoe E., Nakamura N.

    International Conference on Control, Automation and Systems   2018-October   1347 - 1350   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Cleft lip is a congenital facial morphological abnormality. In the clinical field of cleft lip, it is necessary to analyze the symmetry of facial shape, but there is no established technique for analyzing cleft lip from a symmetry viewpoint. Our previous method found a symmetry axis from a 2D image, extracting the midline only from a frontal moire image of the face; its accuracy was degraded by slight rotations of the face, and 3D information could not be considered. In this paper, we propose a method that extracts the median plane of the face by analyzing bilateral symmetry using a 3D point cloud of the frontal face. We believe that extracting the median plane will not only assist surgeons but also provide a clue toward developing simulation software, which is the end goal.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060498535&origin=inward

  • Detection of abnormal shadows on temporal subtraction images based on multi-phase CNN 査読有り

    Nagao M., Miyake N., Yoshino Y., Lu H., Kim H., Murakami S., Aoki T., Kido S.

    International Conference on Control, Automation and Systems   2018-October   1333 - 1337   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, visual screening based on CT images has become a useful tool in the medical field. However, due to the large number of images and the complexity of image processing algorithms, image processing techniques that maintain high screening quality are still required, and several computer-aided diagnosis (CAD) algorithms have been proposed to address this problem. Cancer is a leading cause of death both in Japan and worldwide, and detection of cancer regions in CT images is the most important task for early detection and early treatment. We have designed and developed a framework combining machine learning based on multi-phase convolutional neural networks (CNN) with a temporal subtraction technique based on a non-rigid image registration algorithm. The classification method consists of three main steps: i) preprocessing for image segmentation, ii) image matching for registration, and iii) classification of abnormal regions based on machine learning algorithms. Applying the proposed technique to 25 thoracic MDCT sets, we obtained a true positive rate of 93.55% and a false positive rate of 10.93 per case.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060434756&origin=inward

  • Detection of grasping position from video images based on SSD 査読有り

    Kitayama T., Lu H., Li Y., Kim H.

    International Conference on Control, Automation and Systems   2018-October   1472 - 1475   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, integrated container transportation by road and ship has become the mainstream of international freight transport. Automation of cargo handling work at container terminals is required for several reasons, including the increasing volume of container traffic and the future decline of the working-age population due to the declining birthrate and aging society. This study therefore presents a technique for measuring the relative position of the hanger and the container using a Single Shot Multibox Detector (SSD), with the aim of improving cargo handling efficiency and realizing unmanned container terminals. When the SSD fails to detect the target, detection falls back on AKAZE features. The proposed method is applied to 368 images of container gripping taken by a camera installed on a container crane. As a result, the Intersection over Union (IoU) for container gripping is 87.79%, and the detection rate is 94.57%.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060451358&origin=inward

  • Detection of phalange region based on U-Net 査読有り

    Hatano K., Murakami S., Lu H., Tan J., Kim H., Aoki T.

    International Conference on Control, Automation and Systems   2018-October   1338 - 1342   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Osteoporosis is one of the well-known bone diseases. It is a major cause of deterioration in quality of life, and early detection and early treatment are becoming socially important. Visual screening using Computed Radiography (CR) images is effective for diagnosing osteoporosis, but it increases the burden on doctors, diagnostic results vary with the doctors' experience, and lesions may go undetected. To solve these problems, we are working on a computer-aided diagnosis (CAD) system for osteoporosis. In this paper, we propose a segmentation method for the phalange regions in phalangeal CR images as a preprocessing step for osteoporosis classification. In the proposed method, we construct a segmentation model using U-Net, a type of deep convolutional neural network (DCNN). The proposed method was applied to input images generated from CR images of both hands of 101 patients and evaluated using the Intersection over Union (IoU), yielding an IoU of 0.914.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060468143&origin=inward

  • Enhancement of bone metastasis from CT images based on salient region feature registration 査読有り

    Sato S., Lu H., Kim H., Murakami S., Ueno M., Terasawa T., Aoki T.

    International Conference on Control, Automation and Systems   2018-October   1329 - 1332   2018年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, the development of computer-aided diagnosis (CAD) systems to support radiologists has attracted attention in the medical research field. One such technique is temporal subtraction, which generates images emphasizing temporal changes in lesions by taking the difference between current and previous images of the same subject. In this paper, we propose an image registration method for aligning current and previous images in order to generate temporal subtraction images from CT images with enhanced bone metastasis regions. The proposed registration method consists of three main steps: i) automatic segmentation of the region of interest (ROI) using anatomical position information of the spine, ii) global image matching to select pairs from the previous and current images, and iii) final image matching based on salient region features. We apply the registration technique to synthetic data and confirm the usefulness of the proposed method. Furthermore, radiologists conducted comparative reading experiments with and without the temporal subtraction images created by the proposed method, and showed higher reading performance when using the temporal subtraction images.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060475609&origin=inward

  • Segmentation of Spinal Canal Region in CT Images using 3D Region Growing Technique 査読有り

    Fu G., Lu H., Tan J., Kim H., Zhu X., Lu J.

    2018 International Conference on Information and Communication Technology Robotics, ICT-ROBOT 2018   2018年11月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    A tumor of the spinal cord (spinal cord neoplasm, SCN) is a life-threatening disease, and early detection of SCN plays an important role in the management of the lesions. To plan treatment, it is necessary to segment the spinal canal with accurate three-dimensional image processing. This paper presents a segmentation algorithm based on 3D region growing for extracting the spinal canal from CT images with high accuracy. Intersection over Union (IoU) is used to compare the segmentation results with manual segmentations. In the experiment, the proposed method was tested on 3373 CT slices from 10 patients and achieved an average accuracy of 0.7732 with a variance of 0.0061. Satisfactory results were obtained rapidly, which demonstrates the effectiveness and superiority of the proposed method.

    DOI: 10.1109/ICT-ROBOT.2018.8549913

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060055563&origin=inward
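
    A minimal 3D region-growing sketch in the spirit of the method above: a breadth-first flood fill from a seed voxel with an intensity window and 6-connectivity, in NumPy. The seed, thresholds, and toy volume are illustrative, not the paper's settings.

        import numpy as np
        from collections import deque

        def region_grow_3d(volume, seed, low, high):
            """Grow a binary mask from `seed` over voxels whose intensity lies in [low, high]."""
            mask = np.zeros(volume.shape, dtype=bool)
            queue = deque([seed])
            neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
            while queue:
                z, y, x = queue.popleft()
                if mask[z, y, x] or not (low <= volume[z, y, x] <= high):
                    continue
                mask[z, y, x] = True
                for dz, dy, dx in neighbours:
                    nz, ny, nx = z + dz, y + dy, x + dx
                    if (0 <= nz < volume.shape[0] and 0 <= ny < volume.shape[1]
                            and 0 <= nx < volume.shape[2] and not mask[nz, ny, nx]):
                        queue.append((nz, ny, nx))
            return mask

        vol = np.zeros((8, 8, 8)); vol[2:6, 2:6, 2:6] = 100.0     # toy volume with a bright cube
        print(region_grow_3d(vol, seed=(3, 3, 3), low=50, high=150).sum())   # -> 64 voxels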

  • Object Detection on Video Images Based on R-FCN and GrowCut Algorithm 査読有り

    Mouri K., Lu H., Tan J., Kim H.

    2018 International Conference on Information and Communication Technology Robotics, ICT-ROBOT 2018   2018年11月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Because of the declining birthrate and the aging of society, there is concern about a labor shortage in Japan, and there is a movement to compensate for this shortage by automating factories with robots. Automation is widely promoted in the logistics industry, yet there are few studies on object picking. To address this, we develop an image detection scheme for robotic picking from video images. Recognizing and grasping objects of different types is difficult in robot vision. In the proposed method, object detection and recognition are performed using Region-based Fully Convolutional Networks (R-FCN), a deep-learning-based object detector. After the objects are detected individually, the final target object region is extracted by applying the GrowCut algorithm. As a result, we achieve an average precision of 0.6773 and an Intersection over Union of 0.6395 for the segmentation result.

    DOI: 10.1109/ICT-ROBOT.2018.8549879

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060010049&origin=inward

  • Dual Learning for Visual Question Generation 査読有り

    Xu X., Song J., Lu H., He L., Yang Y., Shen F.

    Proceedings - IEEE International Conference on Multimedia and Expo   2018-July   2018年10月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Recently, automatic answering of visually related questions (VQA) has gained a lot of attention in the computer vision community. However, there is little work on automatically generating questions for images (VQG), even though VQG closes the loop to question answering and yields diverse questions that are useful to VQA research. Motivated by the assumption that learning to answer questions may boost question generation, in this paper we introduce the VQA task as the complement of our primary VQG task and propose a novel model that uses a dual learning framework to jointly learn the two tasks. In the framework, we devise agents for VQG and VQA with pre-trained models, and the learning tasks of the two agents form a closed loop whose objectives are optimized together to guide each other via a reinforcement learning process. Specific rewards for each task are designed to update the agents' models with the policy gradient method, and the relation between the two tasks is exploited to further improve the performance of the primary VQG task. Extensive experiments conducted on two large-scale datasets show that the proposed method is capable of generating grounded visual questions with sufficient coverage and outperforms previous VQG methods on standard measures.

    DOI: 10.1109/ICME.2018.8486475

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85061432991&origin=inward

  • Statistical shape model generation using K-means clustering 査読有り

    Wu J., Li G., Lu H., Kim H.

    ACM International Conference Proceeding Series   207 - 211   2018年09月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Statistical shape models (SSMs) are a robust and efficient tool in medical image segmentation. In this paper, a novel landmark correspondence method based on k-means clustering and demons registration is proposed to train a higher-quality 3D statistical shape model. K-means clustering is performed on the original geometric surface to obtain a simplified surface that serves as the standard set of landmarks, and corresponding landmarks are found on each mapped spherical surface obtained from demons registration in the training set. Twenty cases of left and right lung regions in thoracic MDCT images are used in the experiment to build two SSMs. Performance evaluation shows that SSMs generated by the proposed method achieve better generalization ability and specificity while maintaining the same compactness and segmentation accuracy as those reported by state-of-the-art methods.

    DOI: 10.1145/3277453.3277467

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85060643606&origin=inward
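
    An illustrative sketch of the landmark-simplification idea described above: run k-means on the vertices of a training surface and keep the vertex nearest to each cluster centre as a landmark. It assumes scikit-learn and NumPy; k and the toy point set are placeholders, and the demons-registration correspondence step is not shown.

        import numpy as np
        from sklearn.cluster import KMeans

        def select_landmarks(vertices, k=50, seed=0):
            """vertices: (N, 3) surface points; returns (k, 3) landmark coordinates."""
            km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(vertices)
            landmarks = []
            for j in range(k):
                members = vertices[km.labels_ == j]
                d = np.linalg.norm(members - km.cluster_centers_[j], axis=1)
                landmarks.append(members[d.argmin()])   # snap each centre back onto the surface
            return np.asarray(landmarks)

        pts = np.random.default_rng(0).normal(size=(2000, 3))   # toy stand-in for mesh vertices
        print(select_landmarks(pts).shape)                       # -> (50, 3)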

  • FDCNet: filtering deep convolutional network for marine organism classification 査読有り

    Lu H., Li Y., Uemura T., Ge Z., Xu X., He L., Serikawa S., Kim H.

    Multimedia Tools and Applications   77 ( 17 )   21847 - 21860   2018年09月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Convolutional networks are currently the most popular computer vision methods for a wide variety of applications in multimedia research fields. Most recent methods focus on natural images and usually use a training database, such as ImageNet or Open Images, to learn the characteristics of objects. However, in practical applications, training samples are difficult to acquire. In this study, we develop a powerful approach that can accurately learn marine organisms. The proposed filtering deep convolutional network (FDCNet) classifies deep-sea objects better than state-of-the-art classification methods such as AlexNet, GoogLeNet, ResNet50, and ResNet101: its classification accuracy is 1.8%, 2.9%, 2.0%, and 1.0% higher, respectively. In addition, we have built the first marine organism database, Kyutech10K, with seven categories (shrimp, squid, crab, shark, sea urchin, manganese, and sand).

    DOI: 10.1007/s11042-017-4585-1

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85015699675&origin=inward

  • Editorial: Intelligent Industrial IoT Integration with Cognitive Computing 査読有り

    Zhang Y., Peng L., Sun Y., Lu H.

    Mobile Networks and Applications   23 ( 2 )   185 - 187   2018年04月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    DOI: 10.1007/s11036-017-0939-1

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85046319298&origin=inward

  • Editorial: Artificial Intelligence for Mobile Robotic Networks 査読有り

    Lu H., He L., Zhou Q., Ge Z.

    Mobile Networks and Applications   23 ( 2 )   326 - 327   2018年04月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    DOI: 10.1007/s11036-017-0938-2

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85030681214&origin=inward

  • Proposal of a power-saving unmanned aerial vehicle 査読有り

    Lu H., Li Y., Guna J., Serikawa S.

    EAI International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (TRIDENTCOM)   2017-September   2018年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    In recent years, unmanned aerial vehicle (UAV) technologies have been developing rapidly. Drones, one type of UAV, are used in many industrial fields such as photography, delivery, and agriculture. However, a commercial drone can fly for only about 20 minutes on one charge, is prohibited from flying in restricted areas, and cannot operate in bad weather. As drone technologies develop, we must reduce energy consumption and realize longer-range movement. To overcome these limitations, we develop a new type of drone that can both fly and move on the ground as a vehicle with lower power consumption, which extends the drone's range of mobility and allows it to pass through restricted areas or bad weather conditions by sliding on the ground.

    DOI: 10.4108/eai.28-9-2017.2273334

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85051034970&origin=inward

  • R-FCNとGrowCutを用いたボールペンの検出 査読有り

    毛利 浩介, 陸 慧敏, 金 享燮

    産業応用工学会全国大会講演論文集 ( 一般社団法人 産業応用工学会 )   2018 ( 0 )   85 - 86   2018年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    In Japan, the birthrate is declining and the population is aging, and a labor shortage due to the shrinking working-age population is a concern. The working-age population grew steadily after the war, reaching 87.26 million in 1995, but fell to 77.28 million in 2015 and is projected to decline to 45.29 million by 2065, or 51.4% of Japan's total population (1). There is a movement to compensate for this labor shortage by automating factories and similar workplaces with robots. In this paper, we therefore develop methods for detecting and recognizing target objects and extracting object regions in order to automate picking work using robots.

    DOI: 10.12792/iiae2018.044

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130007542637

  • DCNNによる指骨CR画像からの骨粗しょう症の自動識別 査読有り

    畠野 和裕, 村上 誠一, 植村 知規, 陸 慧敏, タン ジュークイ, 金 亨燮, 青木 隆敏

    Medical Imaging Technology ( 日本医用画像工学会 )   36 ( 2 )   90 - 95   2018年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Osteoporosis is one of the main diseases of bone. Image diagnosis is effective for osteoporosis, but many images show similarly low bone mass, raising issues of objectivity and reproducibility in image diagnosis. In this paper, we propose an automatic classification method for osteoporosis from phalangeal computed radiography (CR) images. In the proposed method, a classifier based on a deep convolutional neural network (DCNN) is constructed to classify the presence or absence of osteoporosis. For training and classification, three kinds of images are created from each CR image, an ROI is extracted from inside each phalange region, and the three ROIs are assigned to the R, G, and B channels to generate a pseudo-color image that is used as the DCNN input. In experiments, the proposed method was applied to 101 cases and achieved a true positive rate (TPR) of 75.5% and a false positive rate (FPR) of 13.9%.

    DOI: 10.11409/mit.36.90

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006588793

  • Introduction to the special section on Artificial Intelligence and Computer Vision

    Lu H., Guna J., Dansereau D.

    Computers and Electrical Engineering   58   444 - 446   2017年02月

     詳細を見る

    記述言語:英語   掲載種別:記事・総説・解説・論説等(大学・研究所紀要)

    DOI: 10.1016/j.compeleceng.2017.04.024

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85018365801&origin=inward

  • 転移深層学習畳み込みニューラルネットワークを用いたCTコロノグラフィ候補陰影からのポリープ分類法 査読有り

    植村 知規, 陸 慧敏, 金 亨燮, 橘 理恵, 弘中 亨, Janne J. Näppi, 吉田 広行

    医用画像情報学会雑誌 ( 医用画像情報学会 )   34 ( 2 )   80 - 86   2017年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Computed tomographic colonography (CTC), also known as virtual colonoscopy, provides a minimally invasive screening method for the early detection of colorectal lesions. It can be used to solve the problems of accuracy, capacity, cost, and safety that have been associated with conventional colorectal screening methods. Computer-aided detection (CADe) has been shown to increase radiologists' sensitivity and to reduce inter-observer variance in detecting colonic polyps in CTC. However, although CADe systems can prompt locations of abnormalities at a higher sensitivity than that of radiologists, they also prompt relatively large numbers of false positives (FPs). In this study, we developed and evaluated the effect of a transfer-learning deep convolutional neural network (TL-DCNN) on the classification of polyp candidates detected by a CADe system from dual-energy CTC images. A deep convolutional neural network (DCNN) that had been pre-trained with millions of natural non-medical images was fine-tuned to identify polyps by use of pseudo-colored images, generated by assigning the axial, coronal, and sagittal images of the polyp candidates to the red, green, and blue channels, respectively. The classification performances of the TL-DCNN and the corresponding non-transfer-learning DCNN were evaluated by 5-fold cross-validation on 20 clinical CTC cases. The TL-DCNN yielded true- and false-positive rates of 73.6% and 1.79%, respectively, which were significantly better than those of the non-transfer-learning DCNN. This preliminary result demonstrates the effectiveness of the TL-DCNN in the classification of polyp candidates from CTC images. A minimal transfer-learning sketch follows this entry.

    DOI: 10.11318/mii.34.80

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006846731
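
    A minimal transfer-learning sketch in the spirit of the entry above, assuming a ResNet50 backbone for illustration (the paper's exact pre-trained network and training schedule are not reproduced): the pre-trained layers are frozen and only a new two-class head for polyp versus non-polyp candidates is trained.

        import torch
        import torch.nn as nn
        from torchvision import models

        model = models.resnet50(weights=None)       # ImageNet weights would be loaded here
        for p in model.parameters():
            p.requires_grad = False                 # freeze the pre-trained feature extractor
        model.fc = nn.Linear(model.fc.in_features, 2)   # new trainable polyp / non-polyp head

        trainable = [p for p in model.parameters() if p.requires_grad]
        optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)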

  • DCNNによるLIDCデータからのすりガラス状陰影の検出 査読有り

    平山 一希, 陸 慧敏, タン ジュークイ, 金 亨燮, 橘 理恵, 平野 靖, 木戸 尚治

    医用画像情報学会雑誌 ( 医用画像情報学会 )   34 ( 2 )   70 - 74   2017年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Lung cancer is one of the most serious cancers in the world. A ground-glass opacity (GGO) appears as a hazy area of increased attenuation in lung CT images. In recent years, the development of computer-aided diagnosis (CAD) systems has advanced to reduce radiologists' workload and improve the detection rate of lesions. In this paper, we propose a CAD system to extract GGOs from CT images. First, we extract the lung region from the input CT images and remove the vessel and bronchial regions based on a 3D line filter algorithm. We then extract initial GGO regions using intensity and gradient information. Next, we calculate statistical features on the segmented regions and classify GGO candidates using a support vector machine (SVM). Finally, we detect the final GGO regions using a deep convolutional neural network (DCNN). The proposed method was tested on 31 cases of CT images from the Lung Image Database Consortium (LIDC). The results demonstrate that the proposed method achieves a true positive rate of 86.05% with 39.03 false positives per case. A sketch of the feature-plus-SVM stage follows this entry.

    DOI: 10.11318/mii.34.70

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006846732
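
    Only the intermediate "statistical features + SVM" stage of the pipeline above is sketched here, with toy candidate regions and illustrative labels; the lung segmentation, 3D line filtering, and final DCNN stages are omitted.

        import numpy as np
        from sklearn.svm import SVC

        def region_features(voxels):
            """A few simple intensity statistics for one candidate region."""
            v = np.asarray(voxels, dtype=np.float64)
            return [v.mean(), v.std(), v.min(), v.max(), np.median(v)]

        # toy candidates: each is a flat array of CT values inside one region
        rng = np.random.default_rng(0)
        candidates = [rng.normal(loc=m, scale=30, size=200) for m in (-700, -650, -300, -250)]
        labels = [0, 0, 1, 1]                     # 0 = false positive, 1 = GGO (toy labels)

        X = np.array([region_features(c) for c in candidates])
        clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
        print(clf.predict(X))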

  • ドローンモーター異常の知的感知による速度制御 査読有り

    木原 圭太, 陸 慧敏, 楊 世淵, 芹川 聖一

    産業応用工学会全国大会講演論文集 ( 一般社団法人 産業応用工学会 )   2017 ( 0 )   33 - 34   2017年01月

     詳細を見る

    記述言語:日本語   掲載種別:研究論文(学術雑誌)

    Unmanned aerial vehicles that fly autonomously under computer control are called drones and are widely used for purposes such as military applications, weather observation, and pesticide spraying. However, drones carry the risk of crash accidents. This study focuses on motor failure caused by high temperature, one of the factors that lead to crashes. First, we build an anomaly detection system to prevent the motor from being operated at temperatures regarded as abnormal. Implementing this countermeasure should make it easier to use drones in environments where failure is not tolerated and to shift from accident-prone manual operation to autonomous flight. Second, to counter sudden increases in the rate of motor temperature rise, we implement a system that adjusts the drone's speed according to the rate of temperature rise so that the motor temperature does not rise easily. Unlike conventional anomaly detection algorithms [1], this algorithm can dynamically change the threshold used to judge an anomaly in real time. A minimal dynamic-threshold sketch follows this entry.

    DOI: 10.12792/iiae2017.018

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006688858
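
    A minimal sketch of the dynamic-threshold idea described above: the anomaly threshold is updated online from the recent rate of temperature rise rather than being fixed. The constants and the update rule are illustrative assumptions, not values from the paper.

        from collections import deque

        class DynamicThreshold:
            def __init__(self, base=60.0, margin=2.0, window=10):
                self.base = base            # baseline "abnormal" motor temperature [deg C]
                self.margin = margin        # how strongly the rise rate tightens the threshold
                self.history = deque(maxlen=window)

            def update(self, temp):
                self.history.append(temp)
                if len(self.history) < 2:
                    return self.base
                rise_rate = (self.history[-1] - self.history[0]) / (len(self.history) - 1)
                # faster temperature rise -> lower (stricter) threshold
                return self.base - self.margin * max(rise_rate, 0.0)

            def is_anomalous(self, temp):
                return temp >= self.update(temp)

        monitor = DynamicThreshold()
        for t in [35, 38, 44, 52, 58, 63]:          # toy temperature samples [deg C]
            print(t, monitor.is_anomalous(t))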

  • 細胞領域の論理積を用いた蛍光顕微鏡画像からの血中循環がん細胞の自動検出 査読有り

    辻 幸喜, 陸 慧敏, タン ジュークイ, 金 亨燮, 米田 和恵, 田中 文啓

    医用画像情報学会雑誌 ( 医用画像情報学会 )   34 ( 4 )   151 - 155   2017年01月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    Circulating tumor cells (CTCs) can be a useful biomarker. They may carry information about malignant disease, since they are one of the causes of cancer metastasis. Blood samples from cancer patients are analyzed with a fluorescence microscope, which captures magnified images under three types of light (red, green, and blue), each reacting with a specific material. Blood contains a large number of cells but only a few CTCs, so analyzing the images is not easy work for pathologists. In this study, we develop a method that automatically detects circulating tumor cells in fluorescence microscopy images. Our proposed method has three steps. First, we extract cell regions in the microscopy images by filtering. Second, we separate connected cell regions into single-cell regions based on the branch-and-bound algorithm. Finally, we identify CTCs using a logical conjunction method. We demonstrated the effectiveness of our proposed method on 6 cases (5,040 microscopy images) and evaluated the performance of CTC identification. Our proposed method achieved a true positive rate of 95.27% and a false positive rate of 6.172%, and we confirmed the effectiveness of the logical conjunction for CTC identification. A small logical-conjunction sketch follows this entry.

    DOI: 10.11318/mii.34.151

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006267703
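
    The final identification step above combines per-channel cell masks with a logical conjunction. The sketch below applies a toy rule (positive in the red and blue masks, negative in the green mask) to binary images; the actual channel semantics and staining protocol are assumptions, not taken from the paper.

        import numpy as np

        def identify_ctc(mask_red, mask_green, mask_blue):
            """Boolean mask of pixels matching a CTC-like signature (toy rule)."""
            return np.logical_and.reduce([mask_red, ~mask_green, mask_blue])

        shape = (8, 8)
        red = np.zeros(shape, bool);   red[2:5, 2:5] = True
        green = np.zeros(shape, bool); green[6:, 6:] = True
        blue = np.zeros(shape, bool);  blue[3:6, 3:6] = True
        print(identify_ctc(red, green, blue).sum(), "candidate pixels")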

  • 海底地形三次元復元のためデプスマップの超解像度化 査読有り

    陸 慧敏, 柏尾 洋平, 古賀 陽介, 李 玉潔, 中島 翔太, 張 力峰, Jože Guna, 芹川 聖一

    産業応用工学会全国大会講演論文集 ( 一般社団法人 産業応用工学会 )   2015 ( 0 )   60 - 61   2015年01月

     詳細を見る

    記述言語:日本語   掲載種別:研究論文(学術雑誌)

    Sonar-based approaches have so far been the main methods for reconstructing underwater terrain. Sonar has the advantage of imaging from long distances, but it is not well suited to close-range imaging, so a device for monitoring the surroundings of mining machinery has been needed. We therefore focused on the fact that 3D reconstruction can be performed from images captured by a Kinect, which can image at close range, and applied super-resolution to the depth images to obtain a high-resolution 3D reconstruction. The super-resolution was studied while also taking into account the influence of mud suspended in the water. As a result, by combining inpainting, haze removal, and super-resolution processing, more accurate distance information was obtained. A pipeline sketch follows this entry.

    DOI: 10.12792/iiae2015.031

    CiNii Article

    その他リンク: https://ci.nii.ac.jp/naid/130006688800
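
    A rough pipeline sketch under stated assumptions: holes in a Kinect-style depth map are filled by inpainting and the map is then upsampled, with plain bicubic interpolation standing in for the super-resolution step and the haze-removal stage omitted. OpenCV is used only for illustration; this is not the authors' implementation.

        import numpy as np
        import cv2

        depth = (np.random.rand(120, 160) * 255).astype(np.uint8)   # toy 8-bit depth map
        depth[40:60, 50:90] = 0                                      # simulated depth dropout

        hole_mask = (depth == 0).astype(np.uint8)                    # 1 where depth is missing
        filled = cv2.inpaint(depth, hole_mask, 3, cv2.INPAINT_TELEA) # fill the holes
        upscaled = cv2.resize(filled, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
        print(upscaled.shape)                                        # (480, 640)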

  • A novel safety light curtain system using a hemispherical mirror 査読有り

    Kenjo Y., Suzue R., Lu H., Miyata K., Yang S., Serikawa S.

    Proceedings of SPIE - The International Society for Optical Engineering   8561   2012年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(国際会議プロシーディングス)

    Light curtain systems are used to detect intruders in various situations and places. However, the position of the light-detecting element must be adjusted accurately in order to receive the irradiating laser light. In this research, we propose a new type of safety light curtain system that uses a hemispherical mirror and an LED. A hemispherical mirror can reflect incoming light rays over 180° in the vertical direction and 360° in the horizontal direction. When the LED is positioned higher than the hemispherical mirror, its light can be reflected by the mirror even if the LED is set up arbitrarily. When an intruder passes between the LED and the hemispherical mirror and intercepts the LED light, the output voltage of the light-detecting element decreases, so a proper threshold voltage can be set to judge whether an intruder has passed. Our system uses a PSoC microcomputer to compare the output voltage of the receiving element with the threshold voltage. In addition, the LED output is modulated at 10 kHz to avoid the influence of surrounding ambient light. Our experiments succeeded in detecting intruders with the proposed system without accurate optical-axis alignment. © 2012 SPIE. An illustrative detection sketch follows this entry.

    DOI: 10.1117/12.999722

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84879867921&origin=inward
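
    The decision logic described above (a 10 kHz modulated LED, a photodetector, and a threshold comparison) can be illustrated in software as follows. The lock-in style amplitude estimate and the threshold value are illustrative assumptions; the actual system runs this judgment on a PSoC microcontroller.

        import numpy as np

        FS, F_MOD = 100_000, 10_000            # sample rate and modulation frequency [Hz]
        t = np.arange(0, 0.01, 1 / FS)         # 10 ms of photodetector samples

        def led_amplitude(signal):
            """Amplitude of the 10 kHz component, which rejects slowly varying ambient light."""
            i = np.mean(signal * np.cos(2 * np.pi * F_MOD * t))
            q = np.mean(signal * np.sin(2 * np.pi * F_MOD * t))
            return 2 * np.hypot(i, q)

        beam_clear = 1.0 * np.sin(2 * np.pi * F_MOD * t) + 0.5       # LED plus ambient offset
        beam_blocked = 0.05 * np.sin(2 * np.pi * F_MOD * t) + 0.5    # mostly ambient light

        THRESHOLD = 0.5                         # chosen for this toy example
        for name, sig in [("clear", beam_clear), ("blocked", beam_blocked)]:
            print(name, led_amplitude(sig) < THRESHOLD)   # True means intruder detected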

  • Enhancing underwater image by dehazing and colorization 査読有り

    Lu H., Lu H., Li Y., Zhang L., Serikawa S.

    International Review on Computers and Software   7 ( 7 )   3470 - 3474   2012年12月

     詳細を見る

    記述言語:英語   掲載種別:研究論文(学術雑誌)

    This paper describes a novel method to enhance underwater images by dehazing and colorization. Scattering and color change are the two major sources of distortion in underwater imaging. Scattering is caused by large suspended particles, as in fog or turbid water. Color change corresponds to the varying degrees of attenuation encountered by light of different wavelengths traveling in water, which renders ambient underwater environments dominated by a bluish tone. Our key contribution is a single-image dehazing algorithm that compensates for the attenuation discrepancy along the propagation path and takes into account the possible presence of an artificial lighting source. Next, the water depth in the image scene is estimated, and a simple colorization method is used to restore the color balance. The enhanced images are characterized by a reduced noise level, better exposure of the dark regions, and improved global contrast, while the finest details and edges are enhanced significantly. In addition, our enhancement method is of comparable or higher quality than state-of-the-art methods. © 2012 Praise Worthy Prize S.r.l. A generic dehazing sketch follows this entry.

    Scopus

    その他リンク: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84875151728&origin=inward
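
    For illustration, the following is a generic single-image dehazing sketch based on the dark channel prior, a representative member of the family of methods discussed above. It is not the authors' underwater-specific algorithm, which additionally models artificial lighting and wavelength-dependent attenuation before colorization.

        import numpy as np

        def dark_channel(img, patch=15):
            """Per-pixel minimum over a local patch and the three colour channels."""
            pad = patch // 2
            padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
            h, w, _ = img.shape
            dark = np.empty((h, w))
            for y in range(h):
                for x in range(w):
                    dark[y, x] = padded[y:y + patch, x:x + patch].min()
            return dark

        def dehaze(img, patch=15, omega=0.95, t_min=0.1):
            """img: float RGB array in [0, 1]; returns a roughly haze-compensated image."""
            dark = dark_channel(img, patch)
            # atmospheric light: mean colour of the brightest ~0.1% dark-channel pixels
            idx = dark.reshape(-1).argsort()[-max(1, dark.size // 1000):]
            A = img.reshape(-1, 3)[idx].mean(axis=0)
            # transmission from the dark channel of the A-normalized image
            t = 1.0 - omega * dark_channel(img / np.maximum(A, 1e-6), patch)
            t = np.clip(t, t_min, 1.0)[..., None]
            return np.clip((img - A) / t + A, 0.0, 1.0)

        out = dehaze(np.random.rand(32, 32, 3))   # toy usage on a random image
        print(out.shape)                          # (32, 32, 3)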

著書

  • Cognitive Internet of Things: Frameworks, Tools and Applications

    Huimin LU(単著)

    Springer International Publishing  2020年01月  ( ISBN:978-3-030-04945-4 )

     詳細を見る

    記述言語:英語

  • Artificial Intelligence and Robotics

    Huimin Lu, Xing Xu(共編者(共編著者))

    Springer  2017年12月  ( ISBN:978-3-319-69877-9 )

     詳細を見る

    記述言語:英語

  • Artificial Intelligence and Computer Vision

    Huimin Lu, Yujie Li(共編者(共編著者))

    Springer  2016年11月 

     詳細を見る

    記述言語:英語

講演

  • AIを活用した水中画像処理技術と深海資源調査への展開

    第1回海中海底工学フォーラム・ZERO  2019年04月 

     詳細を見る

    講演種別:招待講演  

  • Artificial Intelligence in Deep-sea Observing

    The 2nd International Symposium on Artificial Intelligence and Robotics 2017  2017年11月  ISAIR

     詳細を見る

    講演種別:招待講演   開催地:Kitakyushu, Japan  

  • Extreme Optical Imaging for Deep-sea Observing Network

    26th International Electrotechnical and Computer Science Conference ERK 2017  2017年09月  IEEE Slovenia

     詳細を見る

    講演種別:招待講演   開催地:Congress Center Bernardin, Portorož, Slovenia  

  • Next Generation Artificial Intelligence in Society 5.0

    IBM Australia Seminar  2017年09月  IBM Australia

     詳細を見る

    講演種別:座談会   開催地:Melbourne, Australia  

科研費獲得実績

  • 日中超スマート社会の実現に向けた次世代のAI/IoTに関する研究

    研究課題番号:00000001  2018年04月 - 2019年03月   二国間国際交流事業

  • 深海採鉱機採削時の画像計測システムの研究開発

    研究課題番号:17K14694  2017年04月 - 2019年03月   若手研究(B)

  • 深海採鉱機向けリアルタイム小型イメージングシステムの研究開発

    研究課題番号:15F15077  2015年04月 - 2016年09月   特別研究員奨励費

  • 深海採鉱機向け鉱床計測用リアルタイム画像採取処理装置の研究開発

    研究課題番号:13J10713  2013年04月 - 2015年03月   特別研究員奨励費

受託研究・共同研究実施実績

  • 国立情報学研究所共同研究

    2018年04月 - 2019年03月

     詳細を見る

    研究区分:受託研究

  • 国立研究開発法人情報通信研究機構国際交流プログラム

    2018年04月 - 2019年03月

     詳細を見る

    研究区分:受託研究

寄附金・講座

  • 公益財団法人電気通信普及財団 研究調査助成  公益財団法人電気通信普及財団  2019年05月

  • 電気通信普及財団研究調査助成  公益財団法人電気通信普及財団  2018年05月

  • 造船学術研究推進機構 助成金  造船学術研究推進機構  2017年08月

  • 電気通信普及財団 研究調査助成  公益財団法人電気通信普及財団  2017年04月

担当授業科目(学内)

  • 2023年度   実践工学総合科目C

  • 2023年度   ロボットビジョン特論

  • 2023年度   メカトロニクス

  • 2023年度   知能制御応用

  • 2023年度   ロボット制御工学

  • 2023年度   プロセス制御

  • 2022年度   ロボットビジョン特論

  • 2022年度   実践工学総合科目C

  • 2022年度   メカトロニクス

  • 2022年度   ロボット制御工学

  • 2022年度   知能制御応用

  • 2021年度   ロボットビジョン特論

  • 2021年度   メカトロニクス

  • 2021年度   知能制御応用

  • 2021年度   ロボット制御工学

  • 2021年度   機械知能工学入門

  • 2020年度   確率システム制御特論

  • 2020年度   メカトロニクス

  • 2020年度   知能制御応用

  • 2020年度   ロボット制御工学

学会・委員会等活動

  • IEEE Computer Society Big Data Special Technical Committee   共同委員長  

    2019年08月 - 現在

国際会議開催(学会主催除く)

  • EAI International Conference on Robotic Sensor Networks

    2017年11月25日 - 2017年11月26日

  • The 2nd International Symposium on Artificial Intelligence and Robotics 2017

    2017年11月25日 - 2017年11月26日

  • The 1st International Symposium on Artificial Intelligence and Robotics 2016

    Huimin Lu  China  2016年12月13日

国際交流窓口担当

  • リュブリャナ大学  スロベニア共和国  2018年11月 - 現在

  • 南京郵電大学 オートメーション工学部  中華人民共和国  2018年05月 - 現在