Paraphrasing (computational linguistics)

Automatic generation or recognition of paraphrased text

Paraphrase or paraphrasing in computational linguistics is the natural language processing task of detecting and generating paraphrases. Applications of paraphrasing are varied and include information retrieval, question answering, text summarization, and plagiarism detection.^[1] Paraphrasing is also useful in the evaluation of machine translation,^[2] as well as in semantic parsing^[3] and in the generation^[4] of new samples to expand existing corpora.^[5]

Paraphrase generation

Phrase-based Machine Translation

Paraphrases can be generated through the use of phrase-based translation, as proposed by Bannard and Callison-Burch.^[6] The central idea is to align phrases in a pivot language to produce potential paraphrases in the original language. For example, the phrase "under control" in an English sentence is aligned with the phrase "unter kontrolle" in its German counterpart. The phrase "unter kontrolle" is then found in another German sentence whose aligned English phrase is "in check," a paraphrase of "under control."

The probability distribution can be modeled as $\Pr(e_{2}|e_{1})$, the probability that phrase $e_{2}$ is a paraphrase of $e_{1}$, which is equivalent to $\Pr(e_{2}|f)\Pr(f|e_{1})$ summed over all $f$, the potential phrase translations in the pivot language. Additionally, the sentence $S$ in which $e_{1}$ occurs is added as a prior to give the paraphrase context. Thus the optimal paraphrase, ${\hat {e_{2}}}$, can be modeled as:

$${\hat {e_{2}}}=\arg\max _{e_{2}\neq e_{1}}\Pr(e_{2}|e_{1},S)=\arg\max _{e_{2}\neq e_{1}}\sum _{f}\Pr(e_{2}|f,S)\Pr(f|e_{1},S)$$

$\Pr(e_{2}|f)$ and $\Pr(f|e_{1})$ can be approximated by simply taking their relative frequencies in the aligned corpus. Adding $S$ as a prior is modeled by calculating the probability of forming $S$ when $e_{1}$ is substituted with $e_{2}$.
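
The pivot computation can be illustrated with a minimal Python sketch. It estimates $\Pr(e_{2}|f)$ and $\Pr(f|e_{1})$ from relative frequencies and sums over pivot phrases. The aligned pairs, the second German phrase "im griff", and the function name `paraphrase_probs` are hypothetical and chosen only for illustration; the sentence context $S$ is left out for brevity, so this is not the full model described above.

```python
from collections import Counter, defaultdict

# Hypothetical phrase alignments (English phrase, German pivot phrase).
# In practice these would come from a word-aligned bilingual corpus.
aligned_pairs = [
    ("under control", "unter kontrolle"),
    ("under control", "unter kontrolle"),
    ("in check", "unter kontrolle"),
    ("under control", "im griff"),
    ("in hand", "im griff"),
]

def paraphrase_probs(e1, pairs):
    """Estimate Pr(e2 | e1) = sum over f of Pr(e2 | f) * Pr(f | e1).

    Both factors are relative frequencies of the aligned pairs.
    The sentence context S is ignored in this sketch.
    """
    pair_counts = Counter(pairs)
    e_counts = Counter(e for e, _ in pairs)
    f_counts = Counter(f for _, f in pairs)
    scores = defaultdict(float)
    for (e, f), count in pair_counts.items():
        if e != e1:
            continue
        p_f_given_e1 = count / e_counts[e1]
        for (e2, f2), count2 in pair_counts.items():
            if f2 == f and e2 != e1:  # candidates must differ from e1
                scores[e2] += (count2 / f_counts[f]) * p_f_given_e1
    return dict(scores)

probs = paraphrase_probs("under control", aligned_pairs)
best = max(probs, key=probs.get)
print(best, probs)
```

On this toy data "in check" scores 2/3 × 1/3 = 2/9 and "in hand" scores 1/3 × 1/2 = 1/6, so "in check" is selected.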

Paraphrase recognition

Recursive Autoencoders

Paraphrase recognition has been attempted by Socher et al.^[1] through the use of recursive autoencoders. The main concept is to produce a vector representation of a sentence and its components by recursively applying an autoencoder. Paraphrases should have similar vector representations; these representations are processed, then fed as input into a neural network for classification.

Given a sentence $W$ with $m$ words, the autoencoder is designed to take two $n$-dimensional word embeddings as input and produce an $n$-dimensional vector as output. The same autoencoder is applied to every pair of words in $W$ to produce $\lfloor m/2\rfloor$ vectors. The autoencoder is then applied recursively with the new vectors as inputs until a single vector is produced. Given an odd number of inputs, the first vector is forwarded as-is to the next level of recursion. The autoencoder is trained to reproduce every vector in the full recursion tree, including the initial word embeddings.
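
The recursion can be sketched in a few lines of Python. The sketch below only illustrates the pairing scheme and the number of vectors it yields; `encode_pair` is a hypothetical stand-in (an element-wise mean) for the encoder half of a trained autoencoder, and no training or reconstruction step is shown.

```python
def encode_pair(a, b):
    """Stand-in for the trained encoder: maps two n-dim vectors to one.

    Assumption: a real system would use learned weights and a
    nonlinearity; the element-wise mean is used only for illustration.
    """
    return [(x + y) / 2 for x, y in zip(a, b)]

def recursive_encode(embeddings):
    """Return (root_vector, all_vectors_in_tree) for a non-empty sentence."""
    nodes = list(embeddings)   # every vector in the recursion tree
    level = list(embeddings)
    while len(level) > 1:
        next_level = []
        start = 0
        if len(level) % 2 == 1:
            # Odd number of inputs: forward the first vector unchanged.
            next_level.append(level[0])
            start = 1
        for i in range(start, len(level), 2):
            parent = encode_pair(level[i], level[i + 1])
            next_level.append(parent)
            nodes.append(parent)
        level = next_level
    return level[0], nodes

root4, nodes4 = recursive_encode([[0, 0], [2, 2], [4, 4], [6, 6]])
root3, nodes3 = recursive_encode([[1, 0], [0, 1], [1, 1]])
print(len(nodes4), len(nodes3))
```

A four-word sentence yields 7 vectors in total (4 embeddings plus 3 composed vectors), and a three-word sentence yields 5.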

Given two sentences $W_{1}$ and $W_{2}$ of length 4 and 3 respectively, the autoencoders would produce 7 and 5 vector representations, including the initial word embeddings. The Euclidean distance is then taken between every combination of vectors in $W_{1}$ and $W_{2}$ to produce a similarity matrix $S\in \mathbb {R} ^{7\times 5}$. $S$ is then subject to a dynamic min-pooling layer to produce a fixed-size $n_{p}\times n_{p}$ matrix. Since $S$ is not uniform in size across all potential sentence pairs, it is split into $n_{p}$ roughly even sections along each dimension. The output is then normalized to have mean 0 and standard deviation 1 and is fed into a fully connected layer with a softmax output. The dynamic pooling to softmax model is trained using pairs of known paraphrases.
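
The pooling step can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: the helper names are hypothetical, both dimensions of the matrix are assumed to be at least $n_{p}$ (smaller matrices are not handled), and the toy matrix merely stands in for real Euclidean distances.

```python
import math

def distance_matrix(vecs_a, vecs_b):
    """Euclidean distance between every vector in vecs_a and vecs_b."""
    return [[math.dist(a, b) for b in vecs_b] for a in vecs_a]

def split_bounds(length, parts):
    """Cut range(length) into `parts` contiguous, roughly even spans."""
    return [(i * length // parts, (i + 1) * length // parts)
            for i in range(parts)]

def dynamic_min_pool(matrix, n_p):
    """Pool a variable-size matrix down to n_p x n_p by region minimum."""
    rows, cols = len(matrix), len(matrix[0])
    if rows < n_p or cols < n_p:
        raise ValueError("this sketch assumes both dimensions >= n_p")
    row_spans = split_bounds(rows, n_p)
    col_spans = split_bounds(cols, n_p)
    return [
        [min(matrix[r][c] for r in range(r0, r1) for c in range(c0, c1))
         for c0, c1 in col_spans]
        for r0, r1 in row_spans
    ]

def standardize(matrix):
    """Rescale all entries to mean 0 and standard deviation 1."""
    values = [v for row in matrix for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - mean) / std for v in row] for row in matrix]

# Toy 7 x 5 matrix standing in for S (values are not real distances).
toy = [[r * 5 + c for c in range(5)] for r in range(7)]
pooled = dynamic_min_pool(toy, 2)
features = standardize(pooled)
print(pooled)
```

With $n_{p}=2$, the 7 rows are split into spans of 3 and 4 and the 5 columns into spans of 2 and 3, so any pair of sentences is reduced to the same 2 × 2 input size before classification.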

Evaluation

Multiple methods can be used to evaluate paraphrases. Since paraphrase recognition can be posed as a classification problem, most standard evaluation metrics such as accuracy, F1 score, or an ROC curve do relatively well. However, F1 scores are difficult to calculate because it is hard to produce a complete list of paraphrases for a given phrase and because good paraphrases depend on context. A metric designed to counter these problems is ParaMetric.^[20] ParaMetric aims to calculate the precision and recall of an automatic paraphrase system by comparing the automatic alignment of paraphrases to a manual alignment of similar phrases. Since ParaMetric simply rates the quality of phrase alignment, it can also be used to rate paraphrase generation systems, provided they use phrase alignment as part of their generation process. A notable drawback of ParaMetric is the large and exhaustive set of manual alignments that must be created before a rating can be produced.
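
As a reminder of how the classification metrics mentioned above are computed, here is a minimal sketch for binary paraphrase labels (1 = paraphrase, 0 = not a paraphrase). The gold and predicted labels are made up for illustration, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(gold, pred):
    """Accuracy, precision, recall and F1 for binary labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(gold),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical judgments for eight sentence pairs.
gold = [1, 1, 1, 0, 0, 1, 0, 0]
pred = [1, 0, 1, 0, 1, 1, 0, 0]
metrics = classification_metrics(gold, pred)
print(metrics)
```

Recall in particular depends on knowing every true paraphrase in the gold set, which is why an incomplete list of paraphrases makes the F1 score hard to trust.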

The evaluation of paraphrase generation faces difficulties similar to those of evaluating machine translation. The quality of a paraphrase depends on its context, whether it is being used as a summary, and how it is generated, among other factors. Additionally, a good paraphrase is usually lexically dissimilar from its source phrase. The simplest method of evaluating paraphrase generation is to use human judges. Unfortunately, evaluation by human judges tends to be time-consuming. Automated evaluation is challenging because it is essentially as difficult a problem as paraphrase recognition. While originally used to evaluate machine translations, bilingual evaluation understudy (BLEU) has been used successfully to evaluate paraphrase generation models as well. However, paraphrases often have several lexically different but equally valid solutions, which hurts BLEU and other similar evaluation metrics.^[21]

Metrics specifically designed to evaluate paraphrase generation include paraphrase in n-gram change (PINC)^[21] and paraphrase evaluation metric (PEM),^[22] along with the aforementioned ParaMetric. PINC is designed to be used with BLEU and to help cover its inadequacies. Since BLEU has difficulty measuring lexical dissimilarity, PINC measures the lack of n-gram overlap between a source sentence and a candidate paraphrase. It is essentially the Jaccard distance between the sentences, excluding n-grams that appear in the source sentence to maintain some semantic equivalence. PEM, on the other hand, attempts to evaluate the "adequacy, fluency, and lexical dissimilarity" of paraphrases by returning a single-value heuristic calculated using n-gram overlap in a pivot language. However, a large drawback of PEM is that it must be trained using large, in-domain parallel corpora and human judges.^[21] It is equivalent to training a paraphrase recognition system to evaluate a paraphrase generation system.
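
A minimal sketch of PINC follows, based on its usual definition: for each n-gram order up to N, take the fraction of the candidate's n-grams that do not appear in the source, then average over orders. Whitespace tokenization, lowercasing, and skipping orders for which the candidate has no n-grams are simplifying assumptions of this sketch, and the example sentences are invented.

```python
def ngrams(tokens, n):
    """Set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def pinc(source, candidate, max_n=4):
    """Average share of candidate n-grams that are absent from the source.

    0.0 means the candidate copies the source; 1.0 means no n-gram
    of any order is shared.
    """
    src = source.lower().split()
    cand = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        if not cand_ngrams:
            continue  # assumption: skip orders longer than the candidate
        overlap = len(cand_ngrams & ngrams(src, n))
        scores.append(1 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0

score = pinc("the situation is under control",
             "the situation is in check")
print(round(score, 4))
```

Because a high PINC score only rewards novelty, it says nothing about whether meaning is preserved, which is why it is paired with BLEU or human judgments of adequacy.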

The Quora Question Pairs Dataset, which contains hundreds of thousands of duplicate questions, has become a common dataset for the evaluation of paraphrase detectors.^[23] Consistently reliable paraphrase detection models have all used the Transformer architecture, and all have relied on large amounts of pre-training with more general data before fine-tuning with the question pairs.

This article uses material from the Wikipedia article Automated_paraphrasing, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

[1]
Socher, Richard; Huang, Eric; Pennington, Jeffrey; Ng, Andrew; Manning, Christopher (2011), "Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection", Advances in Neural Information Processing Systems 24, archived from the original on 2018-01-06, retrieved 2017-12-29

[2]
Callison-Burch, Chris (October 25–27, 2008). Syntactic Constraints on Paraphrases Extracted from Parallel Corpora. EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing. Honolulu, Hawaii. pp. 196–205.

[3]
Berant, Jonathan; Liang, Percy (2014). "Semantic Parsing via Paraphrasing". Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

[4]
Wahle, Jan Philip; Ruas, Terry; Kirstein, Frederic; Gipp, Bela (2022). "How Large Language Models are Transforming Machine-Paraphrase Plagiarism". Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Online and Abu Dhabi, United Arab Emirates. pp. 952–963. arXiv:2210.03568. doi:10.18653/v1/2022.emnlp-main.62.

[5]
Barzilay, Regina; Lee, Lillian (May–June 2003). Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment. Proceedings of HLT-NAACL 2003.

[6]
Bannard, Colin; Callison-Burch, Chris (2005). Paraphrasing with Bilingual Parallel Corpora. Proceedings of the 43rd Annual Meeting of the ACL. Ann Arbor, Michigan. pp. 597–604.

[7]
Prakash, Aaditya; Hasan, Sadid A.; Lee, Kathy; Datla, Vivek; Qadir, Ashequl; Liu, Joey; Farri, Oladimeji (2016), Neural Paraphrase Generation with Stacked Residual LSTM Networks, arXiv:1610.03098, Bibcode:2016arXiv161003098P

[8]
Zhou, Jianing; Bhat, Suma (2021). "Paraphrase Generation: A Survey of the State of the Art". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Online and Punta Cana, Dominican Republic: Association for Computational Linguistics. pp. 5075–5086. doi:10.18653/v1/2021.emnlp-main.414. S2CID 243865349.

[9]
Dou, Yao; Forbes, Maxwell; Koncel-Kedziorski, Rik; Smith, Noah; Choi, Yejin (2022). "Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics: 7250–7274. arXiv:2107.01294. doi:10.18653/v1/2022.acl-long.501. S2CID 247315430.

[10]
Liu, Xianggen; Mou, Lili; Meng, Fandong; Zhou, Hao; Zhou, Jie; Song, Sen (2020). "Unsupervised Paraphrasing by Simulated Annealing". Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics: 302–312. arXiv:1909.03588. doi:10.18653/v1/2020.acl-main.28. S2CID 202537332.

[11]
Wahle, Jan Philip; Ruas, Terry; Meuschke, Norman; Gipp, Bela (2021). "Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection". 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL). Champaign, IL, USA: IEEE. pp. 226–229. arXiv:2103.12450. doi:10.1109/JCDL52503.2021.00065. ISBN 978-1-6654-1770-9. S2CID 232320374.

[12]
Bandel, Elron; Aharonov, Ranit; Shmueli-Scheuer, Michal; Shnayderman, Ilya; Slonim, Noam; Ein-Dor, Liat (2022). "Quality Controlled Paraphrase Generation". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics: 596–609. arXiv:2203.10940. doi:10.18653/v1/2022.acl-long.45.

[13]
Lee, John Sie Yuen; Lim, Ho Hung; Webster, Carol (2022). "Unsupervised Paraphrasability Prediction for Compound Nominalizations". Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Seattle, United States: Association for Computational Linguistics. pp. 3254–3263. doi:10.18653/v1/2022.naacl-main.237. S2CID 250390695.

[14]
Niu, Tong; Yavuz, Semih; Zhou, Yingbo; Keskar, Nitish Shirish; Wang, Huan; Xiong, Caiming (2021). "Unsupervised Paraphrasing with Pretrained Language Models". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Online and Punta Cana, Dominican Republic: Association for Computational Linguistics. pp. 5136–5150. doi:10.18653/v1/2021.emnlp-main.417. S2CID 237497412.

[15]
Kiros, Ryan; Zhu, Yukun; Salakhutdinov, Ruslan; Zemel, Richard; Torralba, Antonio; Urtasun, Raquel; Fidler, Sanja (2015), Skip-Thought Vectors, arXiv:1506.06726, Bibcode:2015arXiv150606726K

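[16]
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 4171–4186. doi:10.18653/v1/N19-1423. S2CID 52967399.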

[17]
Wahle, Jan Philip; Ruas, Terry; Foltýnek, Tomáš; Meuschke, Norman; Gipp, Bela (2022), Smits, Malte (ed.), "Identifying Machine-Paraphrased Plagiarism", Information for a Better World: Shaping the Global Future, vol. 13192, Cham: Springer International Publishing, pp. 393–413, arXiv:2103.11909, doi:10.1007/978-3-030-96957-8_34, ISBN 978-3-030-96956-1, S2CID 232307572, retrieved 2022-10-06

[18]
Nighojkar, Animesh; Licato, John (2021). "Improving Paraphrase Detection with the Adversarial Paraphrasing Task". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Online: Association for Computational Linguistics. pp. 7106–7116. doi:10.18653/v1/2021.acl-long.552. S2CID 235436269.

[19]
Dopierre, Thomas; Gravier, Christophe; Logerais, Wilfried (2021). "ProtAugment: Intent Detection Meta-Learning through Unsupervised Diverse Paraphrasing". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Online: Association for Computational Linguistics. pp. 2454–2466. doi:10.18653/v1/2021.acl-long.191. S2CID 236460333.

[20]
Callison-Burch, Chris; Cohn, Trevor; Lapata, Mirella (2008). ParaMetric: An Automatic Evaluation Metric for Paraphrasing. Proceedings of the 22nd International Conference on Computational Linguistics. Manchester. pp. 97–104. doi:10.3115/1599081.1599094. S2CID 837398.

[21]
Chen, David; Dolan, William (2011). Collecting Highly Parallel Data for Paraphrase Evaluation. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon. pp. 190–200.

[22]
Liu, Chang; Dahlmeier, Daniel; Ng, Hwee Tou (2010). PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. MIT, Massachusetts. pp. 923–932.

[23]
"Paraphrase Identification on Quora Question Pairs". Papers with Code.
