Chairs: Jean-Christophe Burie & Pau Riba

Jean-Christophe Burie - Opening Session 

Timothy Hospedales - Keynote talk: Free-Hand Sketch Analysis 

#OST-1 - Automatic generation of semi-structured documents by Djedjiga Belhadj, Yolande Belaïd and Abdel Belaïd

#OST-2 - DocVisor: A Multi-purpose Web-based Interactive Visualizer for Document Image Analytics by Khadiravana Belagavi, Pranav Tadimeti and Ravi Kiran Sarvadevabhatla

Chairs: Bidyut Baran Chaudhuri, Amir Hussain, Imran Razzak, Adel M. Alimi, Fadoua Drira & Tarek M. Hamdani

#ASAR-3 - High Performance Urdu and Arabic Video Text Recognition using Convolutional Recurrent Neural Networks by Abdul Rehman, Adnan Ul-Hasan and Faisal Shafait

#ASAR-4 - AOLAH Databases for New Arabic Online Handwriting Recognition Algorithm by Samia Heshmat and Mohamed Abdelnafea

#ASAR-5 - Improving Handwritten Arabic Text Recognition Using an Adaptive Data-Augmentation Algorithm by Mohamed Eltay, Abdelmalek Zidouri, Irfan Ahmad and Yousef Elarian

#ASAR-6 - Towards Boosting the Accuracy of Non-Latin Scene Text Recognition by Sanjana Gunna, Rohit Saluja and C. V. Jawahar

#ASAR-7 - ASAR 2021 Online Arabic Writer Identification Competition by Thameur Dhieb

#ASAR-8 - ASAR 2021 Competition on Online Signal Restoration using Arabic Handwriting Dhad Dataset by Besma Rabhi

#ASAR-9 - ASAR 2021 Competition on Online Arabic Character Recognition: ACRC by Yahia Hamdi

#ASAR-10 - ASAR 2021 Competition on Online Arabic Word Recognition by Hanen Akouaydi

Closing

 

Opening

Lianwen Jin - Keynote: Visual Information Extraction for Document Understanding

Lei Cui - Keynote: Document AI: Benchmarks, Models and Applications

#DIL-12 - A Span Extraction Approach for Information Extraction on Visually-Rich Documents by Tuan-Anh D. Nguyen, Hieu M. Vu, Nguyen Hong Son and Minh-Tien Nguyen

 

Chairs: Dimos, Ruben and Minesh

Minesh - Welcome

Minesh and Ruben - DocVQA Competition Session Introduction

Dawid Jurkiewicz - DocVQA Task3 Winner - Applica AI: Text-Image-Layout Transformer approach to Visual Question Answering

Ryota Tanaka - DocVQA Task3 Runner-up - IG-BERT: Learning Text-Icon-Layout Representations and Arithmetic Operations for Infographic Understanding

Jianglong He - DocVQA Task2 Winner - Infrrd: RADAR (Retrieval of Answers by Document Analysis and Re-ranking)

Brian Price - Invited talk: Understanding Data Visualizations via Question Answering

Chair: Maud Ehrmann

#HIP-8 - Generalized Template Matching for Semi-structured Text by George Nagy

#HIP-9 - BiblIA - a general model for Medieval Hebrew manuscripts and an open annotated dataset by Daniel Stoekl Ben Ezra, Bronson Brown-DeVost, Pawel Jablonski, Hayim Lapin, Benjamin Kiessling and Elena Lolli

#HIP-6 - Visual Analysis of Chapbooks Printed in Scotland by Abhishek Dutta, Giles Bergel and Andrew Zisserman

Heiko Maus - Keynote: Using Knowledge Graphs for Document Analysis Scenarios in Corporate Memories

#DIL-9 - MTL-FoUn: A Multi-Task Learning Approach to Form Understanding by Nishant Prabhu, Hiteshi Jain and Abhishek Tripathi

#DIL-6 - Recurrent Neural Network Transducer for Japanese and Chinese Offline Handwritten Text Recognition by Trung Tan Ngo, Hung Tuan Nguyen, Nam Tuan Ly and Masaki Nakagawa

#DIL-3 - VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach by KERROUMI Mohamed, SAYEM Othmane and SHABOU Aymen

#DIL-5 - Multi-task Learning for Newspaper Image Segmentation and Baseline Detection Using Attention-Based U-Net Architecture by Anukriti Bansal, Prerana Mukherjee, Divyansh Joshi, Devashish Tripathi and Arun Pratap Singh 

#DIL-7 - A Transformer-based Math Language Model for Handwritten Math Expression Recognition by Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen, Thanh-Nghia Truong and Masaki Nakagawa

#DIL-8 - Data-Efficient Information Extraction from Documents with Pre-Trained Language Models by Clément Sage, Thibault Douzon, Alex Aussem, Véronique Eglin, Haytham Elghazel, Stefan Duffner, Christophe Garcia and Jérémy Espinas

#DIL-10 - Exploring Out-of-Distribution Generalization in Text Classifiers Trained on Tobacco-3482 and RVL-CDIP by Stefan Larson, Navtej Singh, Saarthak Maheshwari, Shanti Stewart and Uma Krishnaswamy

#DIL-11 - Labeling Document Images for E-Commence Products with Tree-Based Segment Re-organizing and Hierarchical Transformer by Peng Li, Pingguan Yuan, Yong Li, Yongjun Bao and Weipeng Yan

Best paper award and closing

 

Chairs: Dimos, Ruben and Minesh

Amanpreet Singh - Invited talk: Towards models that can read and reason about scene text

Yijuan Lu - Invited talk: Scene Text-Aware Pre-training for Text-VQA and Text-Caption

Brian Price, Anand Mishra, David Doermann and Filip Graliński - Panel Discussion

Dimos - Closing

FDAR: welcome and keynote talks

Chairs: Isabelle Marthot-Santaniello & Hussein A. Mohammed

Introduction by the Organizers

Peter Stokes - First keynote talk: Multiple Disciplines, Multiple Dimensions: Towards a more holistic study of written objects

Sebastian Bosch - Second keynote talk: The Instrumental Analytics laboratory at the Centre for the Study of Manuscript Cultures (CSMC)

Giuliano Giuffrida - A first analysis of the digital corpus of the Vatican Apostolic Library

Paul Dilley - The Potential of Digital Paleography for the Medinet Madi Coptic Manichaean Corpus

Simona Stoyanova, Jonathan Prag - Integrating palaeographic research into the digital epigraphy of multilingual Sicily

Simon Castellan, Vincent Cohen-Addad, Sophie Giffard-Roisin, Ester Salgarella - Writers Behind Words: Detecting Scribal Variation in Linear A Inscriptions

Simon Gabay, Jean-Baptiste Camps, Claire Jahan, Ariane Pinche - SegmOnto: common vocabulary and practices for analysing the layout of manuscripts (and more)

General discussion

FDAR: working groups and summary of discussions

Chairs: Isabelle Marthot-Santaniello & Hussein A. Mohammed

Chahan Vidal-Gorène and Aliénor Decours-Perez - Computational Approach of Armenian Paleography

Jean-Baptiste Camps, Chahan Vidal-Gorène and Marguerite Vernet - Handling Heavily Abbreviated Manuscripts: HTR engines vs text normalisation approaches

Daniel Stoekl Ben Ezra, Pawel Jablonski and Bronson Brown-DeVost - Exploiting Insertion Symbols for Marginal Additions in the Recognition Process to Establish Reading Order

Nikita Srivatsan, Jason Vega, Christina Skelton and Taylor Berg-Kirkpatrick - Neural Representation Learning for Scribal Hands of Linear B

Olga Serbaeva Saraogi and Stephen White - READ for solving manuscript riddles: a preliminary study of the manuscripts of the 3rd ṣaṭka of the Jayadrathayāmala

General discussion

 

Chair: Jianwen Jin

Keynote - Xiaojun Chang

Chairs: Nibal Nayef & Jean-Christophe Burie

Opening session and Introduction of the Doctoral Consortium

Brief presentation (teasers) of the projects by the PhD students

Daniel Lopresti - Talk “How to succeed in your Ph.D. degree” 

#DC-1 - Computerised Image Processing for Handwritten Text Recognition of Historical Manuscripts by Raphaela Heil

#DC-2 - Text based Visual Question Answering by Rubèn Pérez Tito 

#DC-3 - Information Theoretical Approach To Understand Deep Neural Networks by Christoph Zaugg

#DC-4 - Automatic and model-free learning of semantic-structural links of fields in a document by Ibrahim Souleiman Mahamoud

#DC-5 - Towards an Explainable Deep Model for Archival Document Image Segmentation by Iheb Brini

#DC-6 - Segmentation, Recognition and Indexing of characters in CHAM documents by Tien Nam Nguyen

#DC-7 - Towards semantic understanding of scientific papers by Francesco Lombardi

#DC-8 - Font Design Analysis: Understanding Designers' Knowledge by Using Machine Learning by Daichi Haraguchi

#DC-9 - Structural Analysis and Understanding of Complex Layouts in Document Images by Sanket Biswas

#DC-10 - Automatic recognition of historical handwritten parish records by Solène Tarride

#DC-11 - Automated Summarization of Legal Judgements by Sahar Arshad

#DC-12 - Information Extraction from Legal Documents Based on Natural Language Processing (NLP) by Iqra Basharat

#DC-13 - Weakly-Supervised Scene Text Detection by Mengbiao Zhao

#DC-14 - Optical Handwritten Named Entity Recognition by Thomas Constum

#DC-15 - Multivalent Graph Matching and Ant Colony Optimization for Pattern Recognition by Kieu-Diem Ho

#DC-16 - Deep Neural Networks and Attention Mechanisms for Handwritten Text Recognition by Killian Barrere

This tutorial will present a complete digitisation pipeline applied to historical newspapers, from the digitisation of documents to ways to access them with high-level semantics within a digital library demonstrator. The focus on historical newspapers is a particularly relevant use case as they are unique and detailed records of events, with numerous points of view. Yet, they were by nature not printed to be preserved; they were produced in large quantities, with the intent to be used on a regular basis and replaced by their next issue. This implies production on rather cheap paper and ink, with conservation being a very secondary concern. This tutorial will detail the challenges in digitisation, OCR, layout analysis, article separation, and noise-robust, language-independent semantic enrichment, up to indexing and working with large collections of newspapers in multiple languages coming from multiple sources. The tutorial will rely on the document analysis, recognition and understanding pipeline of the H2020 NewsEye project, as well as its newspaper analysis platform.

https://www.newseye.eu/icdar2021/

Have you ever wondered how machines understand natural language? How “Google Translate” works? How “Siri”, a robotic voice, responds to your voice commands? Or how a piece of software understands a text document and automatically summarizes it or extracts relevant sentences? The answers to all these questions lie in this workshop, where we explore the astounding domain of Natural Language Processing (NLP). We will introduce the core concepts of NLP, building on your own intuition of how you understand natural human language. In this tutorial, we will present the details of the NLP pipeline through a real-world example of text summarization. The tutorial also includes exercises to build intuition in the NLP domain and a code walkthrough for a text summarization task. Summarization models can be trained and applied across a range of domains and applications, and can save a great deal of time otherwise spent reading long documents.

https://www.infocusp.in/icdar_2021.html

Chairs: Nibal Nayef & Jean-Christophe Burie

Session and Discussion with PhD Students

Concluding remarks and Best Poster Award

 

 

Welcome to ICDAR - conference Chairs and surprise invited talk

Rolf Ingold "Welcome to Lausanne, Switzerland"

Invited Talk - surprise

Andreas Fischer "Practical information onsite Beaulieu"

Daniel Lopresti "Submissions & Reviews"

Marcus Liwicki "Practical information online" 

Toward automatic recognition and scoring of handwritten descriptive answers

Chair: Andreas Dengel

Starting from a brief history of offline and online handwriting recognition, I will talk about my experiences of joint projects with companies, which might be useful for the audience. Then I will present the latest challenge: automating the scoring of handwritten answers to descriptive questions. Descriptive questions can test the deep understanding and problem-solving ability of examinees much better than the selection-type questions asked by most CBTs, and they encourage examinees to think rather than select. Fully automatic recognition and scoring of descriptive answers provides immediate feedback so that examinees can review their answers when they can confirm the scoring, while semi-automatic, or computer-assisted, scoring provides reliable scoring when examinees cannot confirm it. Both decrease the time and effort examiners or teachers spend scoring exams. My dream is to unify online recognition of handwritten answers from tablets and offline recognition from scanners, except for several early-stage layers in the DNN. The same DNN architecture may learn to recognize Japanese, English, and Math answers. The DNN for handwritten answer recognition will output reliable features to cluster answers for semi-automatic scoring. The DNN for handwriting recognition could even be merged with that for automatic scoring and trained end-to-end. An initial attempt on Japanese language questions for 120,000 examinees shows a promising result.

Document processing, OCR and their impact

Chair: Faisal Shafait

OCR is a common step in processing documents and extracting their content. Although OCR performance keeps improving, errors remain and may affect later processing steps. Some solutions for dealing with these errors will be proposed.

For online attendees: If you like, please join the virtual ICDAR lobby with breakout rooms:

Zoom ID: 688 9006 7403

Passcode: 456789

Link: https://ltu-se.zoom.us/j/68890067403

We will take a group picture at the beginning of this session (virtual attendees should attend)

The yearly meeting of the TC10 and TC11 committees, with reports, updates, discussions, and the decision on the venue of future ICDARs. Everyone is invited and should participate!

For online attendees: If you would like to join some of our presentations, please join the virtual ICDAR lobby Zoom room. Presentations are planned for around 20:30 (In Memoriam Guy Lorette), 21:00 (Thanks), and 21:30 (future events).

Zoom ID: 688 9006 7403

Passcode: 456789

Link: https://ltu-se.zoom.us/j/68890067403

 

P1-59 - #22 - Full Page Handwriting Recognition via Image to Sequence Extraction by Sumeet S. Singh and Sergey Karayev

P1-4 - #23 - Can Text Summarization Enhance the Headline Stance Detection Task? Benefits and Drawbacks by Marta Vicente, Robiert Sepúlveda-Torres, Cristina Barros, Estela Saquete and Elena Lloret

P1-69 - #29 - Text-line-up: Don’t Worry about the Caret by Chandranath Adak, Bidyut B. Chaudhuri, Chin-Teng Lin and Michael Blumenstein

P1-44 - #37 - Research on pseudo-label technology for multi-label news classification by Lianxi Wang, Xiaotian Lin and Nankai Lin

P1-16 - #39 - Context-Free TextSpotter for Real-Time and Mobile End-to-End Text Detection and Recognition by Ryota Yoshihashi, Tomohiro Tanaka, Kenji Doi, Takumi Fujino and Naoaki Yamashita

P1-62 - #48 - Sequence Learning Model for Syllables Recognition Arranged in Two Dimensions by Valerii Dziubliuk, Mykhailo Zlotnyk and Oleksandr Viatchaninov

P1-37 - #50 - Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer by Wenqi Zhao, Liangcai Gao, Zuoyu Yan, Shuai Peng, Lin Du and Ziyin Zhang

P1-1 - #51 - Towards Document Panoptic Segmentation with Pinpoint Accuracy: Method and Evaluation by Rongyu Cao, Hongwei Li, Ganbin Zhou and Ping Luo

P1-23 - #56 - Sparse Document Analysis using Beta-Liouville Naive Bayes with Vocabulary Knowledge by Fatma Najar and Nizar Bouguila

P1-63 - #59 - Transformer for Handwritten Text Recognition using Bidirectional Post-Decoding by Christoph Wick, Jochen Zöllner and Tobias Grüning

P1-64 - #60 - Zero-Shot Chinese Text Recognition via Matching Class Embedding by Yuhao Huang, Lianwen Jin and Dezhi Peng

P1-65 - #67 - Text-conditioned Character Segmentation for CTC-based Text Recognition by Ryohei Tanaka, Kunio Osada and Akio Furuhata

P1-60 - #76 - SPAN: a Simple Predict & Align Network for Handwritten Paragraph Recognition by Denis Coquenet, Clément Chatelain and Thierry Paquet

P1-51 - #83 - Dialogue Act Recognition using Visual Information by Jiří Martínek, Pavel Král and Ladislav Lenc

P1-52 - #95 - Are End-to-End Systems Really Necessary for NER on Handwritten Document Images? by Oliver Tüselmann, Fabian Wolf and Gernot A. Fink

P1-41 - #96 - Image-based Relation Classification Approach for Table Structure Recognition by Koji Ichikawa

P1-35 - #108 - Rethinking Table Structure Recognition Using Sequence Labeling Methods by Yibo Li, Yilun Huang, Ziyi Zhu, Lemeng Pan, Yongshuai Huang, Lin Du, Zhi Tang and Liangcai Gao

P1-9 - #113 - Label Selection Algorithm Based on Boolean Interpolative Decomposition with Sequential Backward Selection for Multi-label Classification by Tianqi Ji, Jun Li and Jianhua Xu

P1-42 - #115 - Image to LaTeX with Graph Neural Network for Mathematical Formula Recognition by Shuai Peng, Liangcai Gao, Ke Yuan and Zhi Tang

P1-6 - #116 - CoMSum and SIBERT: A Dataset and Neural Model for Query-Based Multi-Document Summarization by Sayali Kulkarni, Sheide Chammas, Wan Zhu, Fei Sha and Eugene Ie

P1-36 - #119 - TabLeX: A Benchmark Dataset for Structure and Content Information Extraction from Scientific Tables by Harsh Desai, Pratik Kayal and Mayank Singh

P1-31 - #129 - Palmira: A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts by S P Sharan, Sowmya Aitha, Amandeep Kumar, Abhishek Trivedi, Aaron Augustine and Ravi Kiran Sarvadevabhatla

P1-28 - #131 - Probabilistic Indexing and Search for Hyphenated Words by Enrique Vidal and Alejandro H. Toselli

P1-66 - #141 - Towards Fast, Accurate and Compact Online Handwritten Chinese Text Recognition by Dezhi Peng, Canyu Xie, Hongliang Li, Lianwen Jin, Zecheng Xie, Kai Ding, Yichao Huang and Yaqiang Wu

P1-7 - #143 - RTNet: An End-to-End Method for Handwritten Text Image Translation by Tonghua Su, Shuchen Liu and Shengjie Zhou

P1-10 - #145 - GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers by Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen and Masaki Nakagawa

P1-26 - #146 - A-VLAD: An End-to-End Attention-based Neural Network for Writer Identification in Historical Documents by Trung Tan Ngo, Hung Tuan Nguyen and Masaki Nakagawa

P1-32 - #149 - Page Layout Analysis System for Unconstrained Historic Documents by Oldřich Kodym and Michal Hradiš

P1-55 - #157 - Consideration of the word’s neighborhood in GATs for information extraction in semi-structured documents by Djedjiga Belhadj, Yolande Belaïd and Abdel Belaïd

P1-24 - #158 - Automatic Signature-based Writer Identification in Mixed-script Scenarios by Sk Md Obaidullah, Mridul Ghosh, Himadri Mukherjee, Kaushik Roy and Umapada Pal

P1-53 - #160 - Training Bi-Encoders for Word Sense Disambiguation by Harsh Kohli

P1-61 - #163 - IHR-NomDB: The Old Degraded Vietnamese Handwritten Script Archive Database by Manh Tu VU, Van Linh LE, and Marie BEURTON-AIMAR

P1-27 - #172 - Manga-MMTL: multimodal multitask transfer learning for manga character analysis by Nhu-Van Nguyen, Christophe Rigaud, Arnaud Revel and Jean-Christophe Burie

P1-29 - #175 - SandSlide: Automatic Slideshow Normalization by Sieben Bocklandt, Gust Verbruggen and Thomas Winters

P1-3 - #185 - Toward Automatic Interpretation of 3D Plots by Laura E. Brandt and William T. Freeman

P1-8 - #195 - NTable: A Dataset for Camera-based Table Detection by Ziyi Zhu, Liangcai Gao, Yibo Li, Yilun Huang, Lin Du, Ning Lu and Xianfeng Wang

P1-18 - #196 - Determining optimal frame processing strategies for real-time document recognition systems by Konstantin Bulatov and Vladimir V. Arlazarov

P1-15 - #201 - Dynamic Receptive Field Adaptation for Attention-Based Text Recognition by Haibo Qin, Chun Yang, Xiaobin Zhu and Xucheng Yin

P1-12 - #206 - LSTMVAEF: Vivid Layout via LSTM-based Variational Autoencoder Framework by Jie He, Xingjiao Wu, Wenxin Hu and Jing Yang

P1-13 - #211 - HCRNN: A Novel Architecture for Fast Online Handwritten Stroke Classification by Andrii Grygoriev, Illya Degtyarenko, Ivan Deriuga, Serhii Polotskyi, Volodymyr Melnyk, Dmytro Zakharchuk and Olga Radyvonenko

P1-70 - #219 - Multimodal Attention-based Learning for Imbalanced Corporate Documents Classification by Ibrahim Souleiman Mahamoud, Joris Voerman, Mickaël Coustaty, Aurélie Joseph, Vincent Poulain d'Andecy and Jean-Marc Ogier

P1-19 - #226 - Embedded Attributes for Cuneiform Sign Spotting by Eugen Rusakov, Turna Somel, Gerfrid G.W. Müller and Gernot A. Fink

P1-54 - #234 - DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction by Freddy C. Chua and Nigel P. Duffy

P1-33 - #235 - Improved Graph Methods for Table Layout Understanding by Jose Ramón Prieto and Enrique Vidal

P1-49 - #236 - Temporal Ordering of Events via Deep Neural Networks by Nafaa Haffar, Rami Ayadi, Emna Hkiri and Mounir Zrigui

P1-2 - #237 - A Math Formula Extraction and Evaluation Framework for PDF Documents by Ayush Kumar Shah, Abhisek Dey and Richard Zanibbi

P1-5 - #242 - The Biased Coin Flip Process for Nonparametric Topic Modeling by Justin Wood, Wei Wang and Corey Arnold

P1-67 - #246 - HCADecoder: A Hybrid CTC-Attention Decoder for Chinese Text Recognition by Siqi Cai, Wenyuan Xue, Qingyong Li and Peng Zhao

P1-48 - #248 - Data Centric Domain Adaptation for Historical Text with OCR Errors by Luisa März, Stefan Schweter, Nina Poerner, Benjamin Roth and Hinrich Schütze

P1-34 - #251 - Unsupervised learning of text line segmentation by differentiating coarse patterns by Berat Kurar Barakat, Ahmad Droby, Raid Saabni and Jihad El-Sana

P1-45 - #254 - Information Extraction from Invoices by Ahmed Hamdi, Elodie Carel, Aurélie Joseph, Mickael Coustaty and Antoine Doucet

P1-46 - #256 - Are You Really Complaining? A Multi-task Framework for Complaint Identification, Emotion and Sentiment Classification by Apoorva Singh and Sriparna Saha

P1-39 - #257 - An Encoder-Decoder Approach to Handwritten Mathematical Expression Recognition with Multi-Head Attention and Stacked Decoder by Haisong Ding, Kai Chen and Qiang Huo

P1-21 - #259 - Two-Step Fine-Tuned Convolutional Neural Networks for Multi-Label Classification of Children's Drawings by Muhammad Osama Zeeshan, Imran Siddiqi and Momina Moetesum

P1-40 - #271 - Global Context for improving recognition of Online Handwritten Mathematical Expressions by Cuong Tuan Nguyen, Thanh-Nghia Truong, Hung Tuan Nguyen and Masaki Nakagawa

P1-50 - #272 - Document Collection Visual Question Answering by Rubèn Tito, Dimosthenis Karatzas and Ernest Valveny

P1-71 - #273 - Light-weight Document Image Cleanup using Perceptual Loss by Soumyadeep Dey and Pratik Jawanpuria

P1-22 - #306 - DCINN: Deformable Convolution and Inception Based Neural Network for Tattoo Text Detection through Skin Region by Tamal Chowdhury, Palaiahnakote Shivakumara, Umapada Pal, Tong Lu, Ramachandra Raghavendra and Sukalpa Chanda

P1-43 - #316 - A Novel Method for Automated Suggestion of Similar Software Incidents using 2-Stage Filtering : Findings on Primary Data by Badal Agrawal, Mohit Mishra and Varun Parashar

P1-47 - #318 - Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer by Rafał Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michał Pietruszka and Gabriela Pałka

P1-20 - #324 - Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach by Adrià Molina, Pau Riba, Lluis Gomez, Oriol Ramos-Terrades and Josep Lladós

P1-11 - #325 - C2VNet: A Deep Learning Framework Towards Comic Strip to Audio-Visual Scene Synthesis by Vaibhavi Gupta, Vinay Detani, Vivek Khokar and Chiranjoy Chattopadhyay

P1-25 - #327 - Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting by Pau Riba, Adrià Molina, Lluis Gomez, Oriol Ramos-Terrades and Josep Lladós

P1-14 - #328 - RFDoc: memory efficient local descriptors for ID documents localization and classification by Daniil Matalov, Elena Limonova, Natalya Skoryukina and Vladimir V. Arlazarov

P1-57 - #334 - Deep Learning for Document Layout Generation: A First Reproducible Quantitative Evaluation and a Baseline Model by Romain Carletto, Hubert Cardot and Nicolas Ragot

P1-38 - #338 - TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition by Umar Khan, Sohaib Zahid, Muhammad Asad Ali, Adnan Ul-Hasan and Faisal Shafait

P1-58 - #342 - MRD: A Memory Relation Decoder for Online Handwritten Mathematical Expression Recognition by Jiaming Wang, Qing Wang, Jun Du, Jianshu Zhang, Bin Wang and Bo Ren

P1-17 - #348 - MIDV-LAIT: a challenging dataset for recognition of IDs with Perso-Arabic, Thai, and Indian scripts by Yulia Chernyshova, Ekaterina Emelianova, Alexander Sheshkus and Vladimir V. Arlazarov

P1-68 - #350 - Meta-learning of Pooling Layers for Character Recognition by Takato Otsuzuki, Heon Song, Seiichi Uchida and Hideaki Hayashi

P1-30 - #366 - Digital Editions as Distant Supervision for Layout Analysis of Printed Books by Alejandro H. Toselli, Si Wu and David A. Smith

P1-56 - #369 - MiikeMineStamps: A Long-Tailed Dataset of Japanese Stamps via Active Learning by Paola A. Buitrago, Evgeny Toropov, Rajanie Prabha, Julian Uran and Raja Adal

#Comp-ST_01 - ICDAR 2021 Competition on Scene Video Text Spotting by Zhanzhan Cheng, Jing Lu, Baorui Zou, Shuigeng Zhou, Fei Wu

#Comp-ST_02 - ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment by Chun Chet Ng, Akmalul Khairi Bin Nazaruddin, Yeong Khang Lee, Xinyu Wang, Yuliang Liu, Chee Seng Chan, Lianwen Jin, Yipeng Sun, Lixin Fan

#Comp-ST_03 - ICDAR 2021 Competition on Components Segmentation Task of Document Photos by Celso A. M. Lopes Junior, Ricardo B. Neves Junior, Byron L. D. Bezerra, Alejandro H. Toselli, Donato Impedovo

#Comp-ST_04 - ICDAR 2021 Competition on Historical Map Segmentation by Joseph Chazalon, Edwin Carlinet, Yizi Chen, Julien Perret, Bertrand Duménieu, Clément Mallet, Thierry Géraud, Vincent Nguyen, Nam Nguyen, Josef Baloun, Ladislav Lenc, Pavel Král

#Comp-ST_05 - ICDAR 2021 Competition on Time-Quality Document Image Binarization by Rafael Dueire Lins, Rodrigo Barros Bernardino, Elisa Barney Smith, Ergina Kavallieratou

#Comp-ST_06 - ICDAR 2021 Competition on On-Line Signature Verification by Ruben Tolosana, Ruben Vera-Rodriguez, Carlos Gonzalez-Garcia, Julian Fierrez, Santiago Rengifo, Aythami Morales, Javier Ortega-Garcia, Juan Carlos Ruiz-Garcia, Sergio Romero-Tapiador, Jiajia Jiang, Songxuan Lai, Lianwen Jin, Yecheng Zhu, Javier Galbally, Moises Diaz, Miguel Angel Ferrer, Marta Gomez-Barrero, Ilya Hodashinsky, Konstantin Sarin, Artem Slezkin, Marina Bardamova, Mikhail Svetlakov, Mohammad Saleem, Cintia Lia Szücs, Bence Kovari, Falk Pulsmeyer, Mohamad Wehbi, Dario Zanca, Sumaiya Ahmad, Sarthak Mishra, Suraiya Jabin

#Comp-ST_07 - ICDAR 2021 Competition on Script Identification in the Wild by Abhijit Das, Miguel A. Ferrer, Aythami Morales, Moises Diaz, Umapada Pal, Donato Impedovo

#Comp-ST_08 - ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX by Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh

#Comp-ST_09 - ICDAR 2021 Competition on Multimodal Emotion Recognition on Comics Scenes by Nhu-Van Nguyen, Xuan-Son Vu, Christophe Rigaud, Lili Jiang, Jean-Christophe Burie

#Comp-ST_10 - ICDAR 2021 Competition on Mathematical Formula Detection by Dan Anitei, Joan Andreu Sánchez, José Manuel Fuentes, Roberto Paredes, José Miguel Benedí

 

 

 

P2-15 - #8 - Near-perfect Relation Extraction from Family Books by George Nagy

P2-9 - #24 - Density Parameters of Handwriting in Schizophrenia and Affective Disorders Assessed Using the Raygraf Computer Software by Barbara Gawda

P2-48 - #28 - A More Effective Sentence-Wise Text Segmentation Approach using BERT by Amit Maraj, Miguel Vargas Martin and Masoud Makrehchi

P2-47 - #30 - Open Set Authorship Attribution toward Demystifying Victorian Periodicals by Sarkhan Badirli, Mary Borgo Ton, Abdulmecit Gungor and Murat Dundar

P2-44 - #34 - FEDS - Filtered Edit Distance Surrogate by Yash Patel and Jiří Matas

P2-35 - #43 - Mask Scene Text Recognizer by Haodong Shi, Liangrui Peng, Ruijie Yan, Gang Yao, Shuman Han and Shengjin Wang

P2-22 - #45 - Bayesian Hyperparameter optimization of Deep Neural Network algorithms based on Ant Colony optimization by Sinda Jlassi, Imen Jdey and Hela Ltifi

P2-6 - #46 - Attention based Multiple Siamese Network for Offline Signature Verification by Yu-Jie Xiong and Song-Yang Cheng

P2-23 - #49 - End-to-End Approach for Recognition of Historical Digit Strings by Mengqiao Zhao, Andre Gustavo Hochuli and Abbas Cheddad

P2-36 - #53 - Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection by Jusung Lee, Jaemyung Lee, Cheoljong Yang, Younghyun Lee and Joonsoo Lee

P2-14 - #55 - A Novel Sigma-Lognormal Parameter Extractor for Online Signatures by Jianhuan Huang and Zili Zhang

P2-28 - #61 - Improving Machine Understanding of Human Intent in Charts by Sihang Wu, Canyu Xie, Yuhao Huang, Guozhi Tang, Qianying Liao, Jiapeng Wang, Bangdong Chen, Hongliang Li, Xinfeng Chang, Hui Li, Kai Ding, Yichao Huang and Lianwen Jin

P2-2 - #63 - Searching from the Prediction of Visual and Language Model for Handwritten Chinese Text Recognition by Brian Liu, Weicong Sun, Wenjing Kang and Xianchao Xu

P2-54 - #70 - EDNets: Deep Feature Learning for Document Image Classification based on Multi-view Encoder-Decoder Neural Networks by Akrem Sellami and Salvatore Tabbone

P2-65 - #71 - Handwriting Recognition with Novelty by Derek S. Prijatelj, Samuel Grieggs, Futoshi Yumoto, Eric Robertson and Walter J. Scheirer

P2-49 - #73 - Data Augmentation for Writer Identification Using a Cognitive Inspired Model by Fabio Pignelli, Yandre M. G. Costa, Luiz S. Oliveira and Diego Bertolini

P2-59 - #78 - GNHK: A Dataset for English Handwriting in the Wild by Alex W. C. Lee, Jonathan Chung and Marco Lee

P2-30 - #86 - Sequential Next-Symbol Prediction for Optical Music Recognition by Enrique Mas-Candela, Maria Alfaro-Contreras and Jorge Calvo-Zaragoza

P2-37 - #88 - Heterogeneous Network Based Semi-supervised Learning For Scene Text Recognition by Qianyi Jiang, Qi Song, Nan Li, Rui Zhang and Xiaolin Wei

P2-38 - #90 - Scene Text Detection with Scribble Line by Wenqing Zhang, Yang Qiu, Minghui Liao, Rui Zhang, Xiaolin Wei and Xiang Bai

P2-25 - #91 - Synthesizing Training Data for Handwritten Music Recognition by Jiří Mayer and Pavel Pecina

P2-39 - #104 - EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition by Jiedong Hao, Yafei Wen, Jie Deng, Jun Gan, Shuai Ren, Hui Tan, and Xiaoxin Chen

P2-29 - #105 - DeMatch: Towards Understanding the Panel of Chart Documents by Hesuo Zhang, Weihong Ma, Lianwen Jin, Yichao Huang, Kai Ding and Yaqiang Wu

P2-17 - #112 - A Modular and Automated Annotation Platform for Handwritings: Evaluation on Under-resourced Languages by Chahan Vidal-Gorène, Boris Dupin, Aliénor Decours-Perez and Thomas Riccioli

P2-18 - #122 - Reducing the Human Effort in Text Line Segmentation for Historical Documents by Emilio Granell, Lorenzo Quirós, Verónica Romero and Joan Andreu Sánchez

P2-53 - #124 - Gender Detection Based on Spatial Pyramid Matching by Fahimeh Alaei and Alireza Alaei

P2-69 - #126 - Recognizing Handwritten Chinese Texts with Insertion and Swapping Using A Structural Attention Network by Shi Yan, Jin-Wen Wu, Fei Yin and Cheng-Lin Liu

P2-3 - #127 - Towards an IMU-based Pen Online Handwriting Recognizer by Mohamad Wehbi, Tim Hamann, Jens Barth, Peter Kaempf, Dario Zanca and Bjoern Eskofier

P2-4 - #132 - 2D vs 3D online writer identification: a comparative study by Antonio Parziale, Cristina Carmona-Duarte, Miguel Angel Ferrer and Angelo Marcelli

P2-10 - #138 - Language-Independent Bimodal System for Early Parkinson’s Disease Detection by Catherine Taleb, Laurence Likforman-Sulem and Chafic Mokbel

P2-5 - #139 - A Handwritten Signature Segmentation Approach for Multi-resolution and Complex Documents Acquired by Multiple Sources by Celso A. M. Lopes Junior, Murilo C. Stodolni, Byron L. D. Bezerra and Donato Impedovo

P2-21 - #140 - Font Style that Fits an Image -- Font Generation Based on Image Context by Taiga Miyazono, Brian Kenji Iwana, Daichi Haraguchi and Seiichi Uchida

P2-67 - #148 - Data Augmentation Based on CycleGAN for Improving Woodblock-printing Mongolian Words Recognition by Hongxi Wei, Kexin Liu, Jing Zhang and Daoerji Fan

P2-16 - #154 - Estimating Human Legibility in Historic Manuscript Images - A Baseline by Simon Brenner, Lukas Schügerl and Robert Sablatnig

P2-66 - #173 - Vectorization of Historical Maps Using Deep Edge Filtering and Closed Shape Extraction by Yizi Chen, Edwin Carlinet, Joseph Chazalon, Clément Mallet, Bertrand Duménieu and Julien Perret

P2-24 - #177 - Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs by Lars Vögtlin, Manuel Drazyk, Vinaychandran Pondenkandath, Michele Alberti and Rolf Ingold

P2-7 - #179 - Attention to Warp: Deep Metric Learning for Multivariate Time Series by Shinnosuke Matsuo, Xiaomeng Wu, Gantugs Atarsaikhan, Akisato Kimura, Kunio Kashino, Brian Kenji Iwana and Seiichi Uchida

P2-1 - #180 - A New Semi-Automatic Annotation Model via Semantic Boundary Estimation for Scene Text Detection by Zhenzhou Zhuang, Zonghao Liu, Kin-Man Lam, Shuangping Huang and Gang Dai

P2-19 - #186 - DSCNN: Dimension Separable Convolutional Neural Networks for character recognition based on inertial sensor signal by Fan Peng, Zhendong Zhuang and Yang Xue

P2-40 - #188 - SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models by Moonbin Yim, Yoonsik Kim, Han-Cheol Cho and Sungrae Park

P2-34 - #192 - Fast Text v. Non-text Classification of Images by Jiri Kralicek and Jiri Matas

P2-68 - #198 - SauvolaNet: Learning Adaptive Sauvola Network for Degraded Document Binarization by Deng Li, Yue Wu and Yicong Zhou

P2-27 - #199 - Complete Optical Music Recognition via Agnostic Transcription and Machine Translation by Antonio Ríos-Vila, David Rizo and Jorge Calvo-Zaragoza

P2-60 - #202 - Personalizing Handwriting Recognition Systems with Limited User-Specific Samples by Christian Gold, Dario van den Boom and Torsten Zesch

P2-50 - #208 - Key-guided Identity Document Classification Method by Graph Attention Network by Xiaojie Xia, Wei Liu, Ying Zhang, Liuan Wang and Jun Sun

P2-41 - #218 - Fast Recognition for Multidirectional and Multi-Type License Plates with 2D Spatial Attention by Qi Liu, Song-Lu Chen, Zhen-Jia Li, Chun Yang, Feng Chen and Xu-Cheng Yin

P2-63 - #221 - AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions by Martin Kišš, Karel Beneš and Michal Hradiš

P2-56 - #232 - Image Collation: Matching illustrations in manuscripts by Ryad Kaoua, Xi Shen, Alexandra Durr, Stavros Lazaris, David Picard and Mathieu Aubry

P2-31 - #241 - Which Parts Determine the Impression of the Font? by Masaya Ueda, Akisato Kimura and Seiichi Uchida

P2-42 - #244 - A Multi-level Progressive Rectification Mechanism for Irregular Scene Text Recognition by Qianying Liao, Qingxiang Lin, Lianwen Jin, Canjie Luo, Jiaxin Zhang, Dezhi Peng and Tianwei Wang

P2-58 - #247 - A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes by Samiul Alam, Tahsin Reasat, Asif Shahriyar Sushmit, Sadi Mohammad Siddique, Fuad Rahman, Mahady Hasan and Ahmed Imtiaz Humayun

P2-46 - #253 - VML-HP: Hebrew paleography dataset by Ahmad Droby, Berat Kurar Barakat, Daria Vasyutinsky Shapira, Irina Rabaev and Jihad El-Sana

P2-70 - #270 - Strikethrough Removal From Handwritten Words Using CycleGANs by Raphaela Heil, Ekta Vats and Anders Hast

P2-26 - #274 - Towards Book Cover Design via Layout Graphs by Wensheng Zhang, Yan Zheng, Taiga Miyazono, Seiichi Uchida and Brian Kenji Iwana

P2-55 - #283 - Fast End-to-end Deep Learning Identity Document Detection, Classification and Cropping by Guillaume Chiron, Florian Arrestier and Ahmad Montaser Awal

P2-32 - #287 - Impressions2Font: Generating Fonts by Specifying Impressions by Seiya Matsuda, Akisato Kimura and Seiichi Uchida

P2-13 - #289 - Applying End-to-end Trainable Approach on Stroke Extraction in Handwritten Math Expressions Images by Elmokhtar Mohamed Moussa, Thibault Lelore and Harold Mouchère

P2-51 - #296 - Document Image Quality Assessment via Explicit Blur and Text Size Estimation by Dmitry Rodin, Vasily Loginov, Ivan Zagaynov and Nikita Orlov

P2-52 - #297 - Analyzing the potential of Zero-Shot Recognition for Document Image Classification by Shoaib Ahmed Siddiqui, Andreas Dengel and Sheraz Ahmed

P2-71 - #305 - Iterative Weighted Transductive Learning for Handwriting Recognition by George Retsinas, Giorgos Sfikas and Christophoros Nikou

P2-61 - #307 - An Efficient Local Word Augment Approach for Mongolian Handwritten Script Recognition by Haoran Zhang, Wei Chen, Xiangdong Su, Hui Guo and Huali Xu

P2-57 - #319 - Revisiting the Coco Panoptic Metric to Enable Visual and Qualitative Analysis of Historical Map Instance Segmentation by Joseph Chazalon and Edwin Carlinet

P2-64 - #320 - TS-Net: OCR Trained to Switch Between Text Transcription Styles by Jan Kohút and Michal Hradiš

P2-62 - #331 - IIIT-INDIC-HW-WORDS: A Dataset for Indic Handwritten Text Recognition by Santhoshini Gongidi and C V Jawahar

P2-11 - #340 - TRACE: A Differentiable Approach to Line-level Stroke Recovery for Offline Handwritten Text by Taylor Archibald, Mason Poggemann, Aaron Chan and Tony Martinez

P2-43 - #346 - Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition by Mengmeng Cui, Wei Wang, Jinjin Zhang and Liang Wang

P2-8 - #347 - Customizable Camera Verification for Media Forensic by Huaigu Cao and Wael AbdAlmageed

P2-12 - #357 - Segmentation and graph matching for online analysis of student arithmetic operations by Arnaud Lods, Éric Anquetil and Sébastien Macé

P2-45 - #360 - Bidirectional Regression for Arbitrary-Shaped Text Detection by Tao Sheng and Zhouhui Lian

P2-33 - #361 - HRRegionNet: Chinese Character Segmentation in Historical Documents with Regional Awareness by Chia-Wei Tang, Chao-Lin Liu and Po-Sen Chiu

P2-20 - #370 - DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis by Sanket Biswas, Pau Riba, Josep Lladós and Umapada Pal

Comp-LT_01 - ICDAR 2021 Competition on Scientific Literature Parsing by Antonio Jimeno Yepes, Peter Zhong, Douglas Burdick

Comp-LT_02 - ICDAR 2021 Competition on Historical Document Classification by Mathias Seuret, Anguelos Nicolaou, Dalia Rodríguez-Salas, Nikolaus Weichselbaumer, Dominique Stutzmann, Martin Mayr, Andreas Maier, Vincent Christlein

Comp-LT_03 - ICDAR 2021 Competition on Document Visual Question Answering by Rubèn Tito, Minesh Mathew, C.V. Jawahar, Ernest Valveny, Dimosthenis Karatzas

#WIADAR-1 - A deep learning digitisation framework to mark up corrosion circuits in piping and instrumentation diagrams by Luis Toral, Carlos Francisco Moreno-García, Eyad Elyan and Shahram Memon

#WIADAR-2 - Toward an incremental classification process of document stream using a cascade of systems by Joris Voerman, Ibrahim Souleiman Mahamoud, Aurélie Joseph, Mickaël Coustaty, Vincent Poulain D'Andecy and Jean-Marc Ogier

#WIADAR-3 - Automating Web GUI Compatibility Testing using X-BROT: Prototyping and Field Trial by Hiroshi Tanaka

#WIADAR-4 - Object Detection Based Handwriting Localization by Yuli Wu, Yucheng Hu and Suting Miao

#WIADAR-5 - Playful interactive environment for learning to spell at elementary school by Sofiane Medjram, Veronique Eglin, Stephane Bres, Adrien Piffaretti, Jobert Timothee

 

OCR: A Journey through Advances in the Science, Engineering, and Productization of AI/ML

Chair: Daniel Lopresti

From the very early years of AI, the problem of optical character recognition (OCR) has captured the imagination of researchers; Selfridge and Neisser presented an approach for OCR of hand-printed characters in 1960. During the last three decades, OCR technology for machine-printed and handwritten text has evolved in significant ways: from script-specific techniques to script-independent methodologies, and from segmentation-based techniques to hidden Markov models to deep learning. In my talk, I will present my perspective on that evolution and its interplay with concomitant advances in speech recognition, natural language processing, and computer vision. The presentation will include a discussion of some practical, even if off the beaten path, applications of OCR technology, including work done in partnership with the census bureau in applying a deep-learning-based OCR framework to census forms. I will also share my views on some of the most interesting open problems in the field of OCR and document processing. The presentation will conclude with a few comments about one of my current areas of research interest: fairness in AI and machine learning.

Chairs: Elisa Barney Smith, Vincent Poulain d'Andecy & Hiroshi Tanaka

In this session, we will get an overview of how document analysis and recognition is used in an industrial context. Our platinum sponsors, as well as highlights from the WIADAR 2021 workshop, will be presented. The posters of WIADAR 2021 will be displayed during Poster Session 2.

Cracking Ciphers with “AI-in-the-loop”: Transcription and Decryption in a Cross-Disciplinary Field

Chair: Josep Lladós

Accurate transcription of handwritten texts in images is indispensable in many research areas in digital humanities. Manual transcription is error-prone, time-consuming, and expensive to produce. Historical texts with their specific textual qualities require expert knowledge and trained eyes. In recent years, image processing applied to handwritten historical documents to produce transcriptions has shown great opportunities, but also challenges for users. How can users without knowledge of AI in general and handwritten text recognition (HTR) in particular transcribe handwritten documents efficiently with “AI-in-the-loop”?
In my talk, I will focus on encrypted manuscripts from Early Modern times with various symbol systems, handwriting styles, and languages. The point of departure is the DECRYPT project, which aims to create resources and tools for historical cryptology by bringing together the expertise of various disciplines to collect images of ciphers and keys, transcribe them, and decrypt and contextualize them. I will give an overview of the project and of the methods we use to solve various problems from transcription to decryption, including historical corpora and natural language processing methods.

Chairs: Harold Mouchère and Foteini Liwicki

ICDAR 2021 Competition overall presentation - Foteini Liwicki, Harold Mouchère

ICDAR 2021 Competition on Scientific Literature Parsing - Antonio Jimeno Yepes, Peter Zhong, Douglas Burdick

ICDAR 2021 Competition on Historical Document Classification - Mathias Seuret, Anguelos Nicolaou, Dalia Rodríguez-Salas, Nikolaus Weichselbaumer, Dominique Stutzmann, Martin Mayr, Andreas Maier, Vincent Christlein

ICDAR 2021 Competition on Document Visual Question Answering - Rubèn Tito, Minesh Mathew, C.V. Jawahar, Ernest Valveny, Dimosthenis Karatzas 

Chairs: Rolf Ingold, Andreas Fischer, Marcus Liwicki

In this session, the following awards will be presented:

ICDAR 2021 Best Paper Award

ICDAR 2021 Best Poster Award

ICDAR 2021 Best Student Paper Award

ICDAR 2021 Best Industry Paper Award

Afterwards, there will be some concluding remarks.