This paper has been accepted to AAAI 2019. In it, the authors state that they are the first team to successfully use a CNN decoder for code generation. So the questions arise: how does using a CNN decoder for code generation differ from previous approaches? What is special about the model? And how good are the results? We answer each of these questions below.

Advantages of generating code with CNN decoder

Generating code from natural-language descriptions is difficult. Recurrent neural networks (RNNs) are now the usual choice for sequence generation, and they handle tasks such as poetry generation and machine translation without trouble. Code generation is where the trouble starts. A program contains a great deal of structural information that matters for modeling it, but traditional Seq2Seq neural networks do not model program structure explicitly. Take a Python abstract syntax tree (AST) as an example.
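As a rough illustration of the structure problem (a sketch using Python's standard `ast` module, not the paper's figure or code), the snippet below flattens an AST into a pre-order sequence and measures how far a child can land from its parent. The helper names `preorder` and `max_parent_child_gap` and the sample function are hypothetical.

```python
import ast


def preorder(tree):
    """Return (node, parent_index) pairs in pre-order; the root's parent index is None."""
    order = []

    def visit(node, parent_idx):
        idx = len(order)
        order.append((node, parent_idx))
        for child in ast.iter_child_nodes(node):
            visit(child, idx)

    visit(tree, None)
    return order


def max_parent_child_gap(order):
    """Largest distance in the flattened sequence between a node and its parent."""
    return max(i - p for i, (_, p) in enumerate(order) if p is not None)


# Hypothetical sample program, chosen only to produce a small tree.
source = "def init(a):\n    x = a + 1\n    return x\n"
order = preorder(ast.parse(source))
labels = [type(node).__name__ for node, _ in order]
gap = max_parent_child_gap(order)

for i, (node, parent_idx) in enumerate(order):
    print(i, type(node).__name__, "parent:", parent_idx)
print("largest parent-child gap:", gap)
```

In the printed sequence, the `Return` node is a direct child of `FunctionDef` but appears many positions later, because the whole subtree of the preceding statement is emitted first. A sequence model has to bridge that gap on its own.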
In the example tree, the nodes n3 and n6 should interact closely as parent and child. With a traditional Seq2Seq approach, however, they end up far apart from each other in the sequence.

To solve this problem, researchers have tried a variety of approaches. One of the key ones is to use convolutional neural networks (CNNs), which are efficient and easy to train. This paper is a representative example, and, as the first to successfully use a CNN decoder for code generation, something of a watershed.

In the paper, the authors also report that this works much better than the earlier RNN-based approaches. The main reason is that programs are generally much longer than natural-language sentences, so even an RNN enhanced with LSTM (long short-term memory) still suffers from long-term dependency problems. A CNN is less affected, because its sliding window can effectively capture features of different regions.

So how is the model designed?

Model design

The CNN proposed in the paper is a grammar-based structural CNN. The model generates code according to the grammar rules of the AST: it predicts the sequence of grammar rules and from them constructs the entire program.

How does it predict grammar rules? It relies on three main types of information: the source sequence that specifies the program to be generated, the previously predicted grammar rules, and the partial AST that has already been generated. The first is easy to understand; it is the input to the encoder. The latter two make the decoder autoregressive, and the decoder is also conditioned on the encoder.

To make this structural CNN better suited to code generation, the authors designed several components.

First, they use the idea of tree-based convolution, applying a sliding window over the AST structure. They then design another CNN module that is applied to the nodes of the partial AST in pre-order traversal. Together, these two types of CNN capture "neighbor" information both in the sequence and in the tree structure.

Second, another CNN module is applied to the ancestors of the node to be generated, so that the network knows where in the tree it is generating at a given step. This strengthens the autoregressive behavior.

Third, they design dedicated attention mechanisms that let the features of the different CNN modules interact.

In addition, the authors note that scope names (for example, function and method names) are useful during code generation, so they use this information as the controller for several pooling layers.

The result is the following model.
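The tree-based window and the ancestor path can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a window made of a node, its parent, and its grandparent, and it uses random untrained weights. The names `tree_window_features` and `ancestor_path`, the toy tree, and the dimensions are all hypothetical.

```python
import random


def tree_window_features(parents, embeddings, weights, depth=3):
    """For each node, concatenate the embeddings of the node and its nearest
    ancestors (zero-padded above the root), then apply one shared linear map
    followed by ReLU. The same weights slide over every node in the tree."""
    dim = len(embeddings[0])
    zero = [0.0] * dim
    outputs = []
    for i in range(len(parents)):
        window, node = [], i
        for _ in range(depth):
            window.extend(embeddings[node] if node is not None else zero)
            node = parents[node] if node is not None else None
        outputs.append(
            [max(0.0, sum(w * x for w, x in zip(row, window))) for row in weights]
        )
    return outputs


def ancestor_path(parents, i):
    """Indices from node i up to the root: the kind of input an ancestor module would see."""
    path = []
    while i is not None:
        path.append(i)
        i = parents[i]
    return path


# Toy partial AST listed in pre-order; parents[i] is the index of node i's parent.
parents = [None, 0, 1, 1, 3, 3]

random.seed(0)
dim, out_dim, depth = 4, 2, 3
embeddings = [[random.uniform(-1, 1) for _ in range(dim)] for _ in parents]
weights = [[random.uniform(-1, 1) for _ in range(dim * depth)] for _ in range(out_dim)]

features = tree_window_features(parents, embeddings, weights, depth)
path = ancestor_path(parents, 5)
print(len(features), "nodes,", len(features[0]), "features each")
print("ancestor path of node 5:", path)
```

The window follows parent links rather than sequence positions, so a node and its parent always share a window no matter how far apart they sit in the pre-order sequence.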
Model overview. The dashed arrows indicate the attention controllers.

How effective is this model in practice?

Model Effectiveness

The authors evaluated the model on two tasks: generating Python code for the card game Hearthstone, and generating executable logical forms for semantic parsing.

Generate Python code for Hearthstone

This task uses the Hearthstone benchmark dataset, which contains 665 different cards in total. The input is a semi-structured description made up of fields such as the card name, cost, attack, description, and other attributes.
The output is a Python code snippet that implements the card's functionality.
The quality of the model is measured by accuracy and BLEU score. For accuracy, the authors follow most previous studies and compute it by exact string match (denoted StrAcc). Sometimes a generated program uses different variable names but is functionally correct; string matching misses these cases, so they require manual adjustment. The manually adjusted accuracy is denoted Acc. Finally, the quality of the generated code is also evaluated with BLEU. The results are shown in the following figure.
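The string-match accuracy described above can be sketched in a few lines. This is an illustrative version only: the function name `str_acc`, the whitespace stripping, and the sample programs are assumptions, not the paper's evaluation script.

```python
def str_acc(predictions, references):
    """Fraction of generated programs that equal their reference as strings.

    Assumption: surrounding whitespace is stripped before comparison.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical toy data: the second prediction is functionally equivalent
# to its reference but is not an exact string match.
references = ["return x + 1", "return a * b", "print(total)"]
predictions = ["return x + 1", "return b * a", "print(total)"]

score = str_acc(predictions, references)
print(f"StrAcc = {score:.3f}")
```

The second pair shows why a manually adjusted score is reported as well: `return b * a` behaves the same as `return a * b` but counts as a miss under exact matching.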
The model outperformed all previous models in both accuracy and BLEU score. StrAcc is 5 percentage points higher than that of the previous best model. The manually adjusted Acc reached 30.3 percent, a gain of 3 percentage points from manual adjustment, compared with a gain of 2 percent for the previous best model. The authors argue that this shows the effectiveness of their approach. As for why their BLEU score is close to that of previous models, the authors explain that correct code generation ultimately depends on getting the details right.

Semantic Parsing Task

The semantic parsing task uses two datasets, ATIS and JOBS, whose inputs are natural-language sentences. The output for ATIS is in λ-calculus form, while the output for JOBS is in Prolog form.
On both datasets, the model proposed in the paper does not show much advantage.
In the paper, the authors suggest that this may be because the logical forms in semantic parsing are usually short, so RNNs and CNNs can both generate them well. Even so, the experiment demonstrates the generality and flexibility of CNN-based code generation: the model is designed mainly for long programs, yet it also performs well on semantic parsing.

About the authors

In order of authorship, the authors are Zeyu Sun, Qihao Zhu, Lili Mou, Yingfei Xiong, Ge Li, and Lu Zhang; Yingfei Xiong is the corresponding author. The authors are with the School of Information Science and Technology, Peking University.