Dave 🦔 is spooky at #INLG2019 🎃 (@_dmh)

Computational linguist teaching computers to use their words. 🇿🇦🇺🇸 in 🇬🇧 Queer & Catholic. He/him. Tweets = sci, tech, politics, etc. More spam on @_dmh_freq

davehowcroft.com · Joined May 2008

Tweets
  1. @_dmh · 46 minutes ago

    Keynote this morning at #INLG2019: Philipp Koehn with "Challenges for Neural Sequence Generation Models - Insights from Machine Translation". #MachineTranslation is #NaturalLanguageGeneration inspired by other ideas.
  2. @_dmh · 44 minutes ago

    Koehn: 20 years ago at USC we ran an experiment giving humans small amounts of parallel data and asking them to do translations. With world knowledge and strong language models, humans are quite good at this.
  3. @_dmh · 43 minutes ago

    Koehn: Following up in 2010 ("Enabling monolingual translation"), we gave humans an original text with MT guesses for phrase-by-phrase translations. Monolinguals are able to do fairly well in creating a translation by choosing from these options.
  4. @_dmh · 42 minutes ago

    Koehn: More evidence for language models (LMs): better language models lead to better translations. Edinburgh's 2013 WMT system was trained on 126 billion tokens and required 1 TB of RAM to translate, but yielded a 0.8 point increase in BLEU.
  5. @_dmh · 40 minutes ago

    Koehn: When I went to Hopkins I bought a high-RAM computer & some speech researchers wanted GPUs - I didn't understand at the time, but I learnt soon.
  6. @_dmh · 40 minutes ago

    Koehn: Now we're in the age of neural MT (NMT) and we typically see grammatically well-formed translations w/good ability to handle long-distance agreement.
  7. @_dmh · 38 minutes ago

    Koehn: Is it just better fluency? Sennrich & Haddow (2017 EAMT tutorial) looked at their recent top NMT model compared to statistical MT and found a 1% improvement in adequacy but a 13% improvement in fluency.
  8. @_dmh · 6:15 PM - 30 Oct 2019

    Koehn: Of course we saw headlines about Google's NMT when it was producing nonsense based on the training data (talking about the second coming of Jesus, etc.). Now we'll talk a bit more about "hallucinations": when neural MT goes bad, and how this relates to NLG.
  9. @_dmh · 31 minutes ago

    Koehn: Neural MT breaks even on BLEU with SMT after about 100 million words of training data. pic.twitter.com/oADaFP0GUG
  10. @_dmh · 30 minutes ago

    See some examples of how limited training data results in much worse NMT translations. pic.twitter.com/X0sxTvqtUi
  11. @_dmh · 29 minutes ago

    The green bars here are all a bit shorter than the blue, indicating that NMT does worse out of domain than SMT. pic.twitter.com/WmnRnHwm7u
  12. @_dmh · 27 minutes ago

    Koehn: It used to be that expanding the beam for SMT yielded consistent improvements at the cost of more compute time. However, if you expand the beam too much, you start seeing clear degradations in NMT quality.
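
Beam search is the knob Koehn is referring to here: the decoder keeps the k highest-scoring partial translations at each step, and k is the beam size. Below is a minimal, self-contained sketch of that procedure; `toy_next_token_probs` is a made-up stand-in for a real NMT model's softmax (which would also condition on the source sentence), not anything from the talk.

```python
# Minimal beam-search decoder over a toy next-token distribution.
import math

VOCAB = ["the", "cat", "sat", "mat", "</s>"]

def toy_next_token_probs(prefix):
    """Fake P(next token | prefix); a real NMT model would also condition on the source."""
    base = {tok: 1.0 for tok in VOCAB}
    base["</s>"] += len(prefix)          # prefer ending as the prefix grows
    total = sum(base.values())
    return {tok: p / total for tok, p in base.items()}

def beam_search(beam_size, max_len=10):
    beams = [([], 0.0)]                  # (token sequence, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, p in toy_next_token_probs(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep only the `beam_size` highest-scoring partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            (finished if seq[-1] == "</s>" else beams).append((seq, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])

print(beam_search(beam_size=1))    # greedy decoding
print(beam_search(beam_size=10))   # wider beam explores more hypotheses
```
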
  13. @_dmh · 26 minutes ago

    Koehn: We explored the impact of noise on NMT compared to SMT and found across a variety of kinds of noise that NMT is less robust to noise than SMT.
  14. @_dmh · 25 minutes ago

    Here's the most extreme example: NMT suffers hugely if you have untranslated sentences in your corpus. pic.twitter.com/jM3QWr68uy
  15. @_dmh · 23 minutes ago

    Koehn: To address these problems we created the WMT 2019 filtering task, where the goal is to filter out noisy sentence pairs from the data before training. Looking at systems trained on different amounts of noise, we see the same trend across a variety of systems: SMT > NMT.
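
The filtering task asks participants to score and discard noisy sentence pairs before training. As a rough illustration of the kind of heuristics involved, here is a toy filter using two simple signals (length ratio and copied/untranslated text); actual shared-task systems combine much stronger signals such as language identification and dual cross-entropy scores, and none of the thresholds below come from the talk.

```python
# Toy filter for noisy sentence pairs, in the spirit of the WMT 2019
# parallel-corpus filtering task.

def keep_pair(src: str, tgt: str, max_len_ratio: float = 2.0) -> bool:
    src_toks, tgt_toks = src.split(), tgt.split()
    if not src_toks or not tgt_toks:
        return False                      # empty side
    ratio = len(src_toks) / len(tgt_toks)
    if ratio > max_len_ratio or ratio < 1.0 / max_len_ratio:
        return False                      # implausible length mismatch
    if src.strip().lower() == tgt.strip().lower():
        return False                      # untranslated (copied) sentence
    return True

corpus = [
    ("Das Haus ist klein.", "The house is small."),
    ("Das Haus ist klein.", "Das Haus ist klein."),                  # copied
    ("Hallo.", "This is a very long unrelated English sentence."),   # length mismatch
]
filtered = [pair for pair in corpus if keep_pair(*pair)]
print(filtered)   # only the first pair survives
```
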
  16. @_dmh · 22 minutes ago

    Koehn: What is happening? Why is this happening? Visualization, probing internal states, and trying to trace back decisions to inputs are approaches we are taking to try to understand the problem.
  17. @_dmh · 18 minutes ago

    Koehn: In one analysis aiming to detect hallucinations, we looked at the KL divergence between the NMT model's predictions for the next word and the LM's. High KL divergence indicates instances where the input matters more than the LM. This approach did not really succeed :/
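
To make the quantity concrete: at a given decoding step you compare the translation model's next-word distribution with a language model's; a large KL divergence suggests the source input, rather than the LM prior, is driving the prediction. A minimal numeric sketch follows; both distributions are invented for illustration.

```python
# KL divergence between a translation model's next-word distribution and a
# language model's, per the hallucination-detection idea described in the talk.
import math

def kl_divergence(p, q):
    """D_KL(p || q) over a shared vocabulary, in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

lm       = {"cat": 0.40, "dog": 0.30, "house": 0.20, "</s>": 0.10}   # LM prior
nmt_step = {"cat": 0.05, "dog": 0.05, "house": 0.80, "</s>": 0.10}   # one NMT step

# Large value: the source input is pulling the prediction away from the LM prior.
print(kl_divergence(nmt_step, lm))
```
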
  18. @_dmh · 17 minutes ago

    Koehn: So the next thing we tried was based on the idea of saliency from vision. Idea: if changes in the input cause changes in the output, then the input mattered. (Makes me wonder how the sizes of the differences in the input and output relate to one another.)
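
A cheap way to operationalise that idea is perturbation-based saliency: delete one source token at a time and measure how much the model's score for the output changes. The sketch below does exactly that with a toy scoring function; `score_translation` is a placeholder for log P(target | source) from a real NMT model.

```python
# Perturbation-style saliency: if removing a source token changes the output
# score a lot, that token mattered.

def score_translation(src_tokens, tgt_tokens):
    """Toy scorer: rewards source/target token overlap (stand-in for a real model score)."""
    return sum(1.0 for t in tgt_tokens if t in src_tokens)

def saliency(src_tokens, tgt_tokens):
    base = score_translation(src_tokens, tgt_tokens)
    scores = {}
    for i, tok in enumerate(src_tokens):
        perturbed = src_tokens[:i] + src_tokens[i + 1:]   # drop one source token
        scores[tok] = base - score_translation(perturbed, tgt_tokens)
    return scores

src = ["the", "cat", "sat"]
tgt = ["cat", "sat"]
print(saliency(src, tgt))   # {'the': 0.0, 'cat': 1.0, 'sat': 1.0}
```
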
  19. @_dmh · 15 minutes ago

    Koehn: "Just looking at attention weights often doesn't tell you much", based on Ding et al. 2019 @ WMT, where they showed saliency was more informative wrt word alignments.
  20. @_dmh · 13 minutes ago

    Koehn: Where to next? We could look at this as an ML problem, arguing that it's overfitting on the data, failing to generalize to data outside the model's comfort zone. Or perhaps arguing that it's 'exposure bias': the models never see bizarre data during training time...
  21. @_dmh · 13 minutes ago

    ...so then it doesn't know how to cope with this scenario at test time if it starts a sentence with an odd prefix.
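
The exposure-bias argument is easiest to see in a toy rollout: with teacher forcing the decoder always conditions on gold prefixes, so a mistake never feeds back into its inputs during training, whereas at test time the mistake becomes the next input and errors compound. The sketch below uses an invented bigram "decoder" purely to illustrate the contrast.

```python
# Exposure bias in miniature. `step` stands in for one decoder step that
# returns the next token given the previous one.

gold = ["the", "cat", "sat", "on", "a", "mat"]

def step(prev_token):
    learned = {"the": "cat", "cat": "sat", "sat": "on", "on": "a", "a": "mat"}
    if prev_token == "cat":
        return "slept"                    # a single wrong prediction
    return learned.get(prev_token, "the")

# Training-style rollout (teacher forcing): inputs are always gold tokens,
# so the wrong prediction does not affect later inputs.
teacher_forced = [step(prev) for prev in gold[:-1]]

# Test-style rollout (free running): the mistake becomes the next input,
# which the model never saw during training.
free_running, prev = [], gold[0]
for _ in range(len(gold) - 1):
    prev = step(prev)
    free_running.append(prev)

print(teacher_forced)   # ['cat', 'slept', 'on', 'a', 'mat']   (one error)
print(free_running)     # ['cat', 'slept', 'the', 'cat', 'slept']   (errors compound)
```
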
  22. @_dmh · 12 minutes ago

    Koehn: So why don't we make the data more diverse? We can do data synthesis (e.g. by vocabulary replacement): "I watered my flowers" -> "I/We/You watered my roses/plants/etc."
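
As a concrete version of the flowers example, vocabulary replacement just swaps whole word classes into a template to multiply the training instances. A minimal sketch, with made-up substitution lists:

```python
# Data synthesis by vocabulary replacement, following the talk's example
# "I watered my flowers" -> "I/We/You watered my roses/plants/...".
from itertools import product

template = "{subj} watered my {obj}"
subjects = ["I", "We", "You"]
objects  = ["flowers", "roses", "plants"]

synthetic = [template.format(subj=s, obj=o) for s, o in product(subjects, objects)]
for sentence in synthetic:
    print(sentence)
# In MT the same substitution would be applied consistently to the source and
# target sides of each pair, so the new pairs stay parallel.
```
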
  23. @_dmh · 12 minutes ago

    Koehn: We can also look at paraphrasing: paraphrasing the source, the target, or both, and creating new training instances this way.
  24. @_dmh · 11 minutes ago

    Koehn: Paraphrasing helps more when you paraphrase the target side (Hu et al. 2019 @ ACL, ongoing work).
  25. @_dmh · 9 minutes ago

    Another approach is backtranslation. pic.twitter.com/VwQMekO597
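
Back-translation generates synthetic parallel data by running monolingual target-language text through a reverse (target-to-source) model and pairing the output with the original sentences. The sketch below shows only the data flow; `reverse_model_translate` is a toy stand-in, not a real system.

```python
# Back-translation data flow: monolingual target-side text is translated into
# the source language by a reverse model, and the resulting synthetic pairs
# are added to the training data.

def reverse_model_translate(target_sentence: str) -> str:
    """Stand-in for a trained target->source MT system."""
    toy_dictionary = {"the": "das", "house": "Haus", "is": "ist", "small": "klein"}
    return " ".join(toy_dictionary.get(w, w) for w in target_sentence.lower().split())

monolingual_english = ["The house is small", "The house is big"]

synthetic_pairs = [(reverse_model_translate(sent), sent) for sent in monolingual_english]
# These (synthetic source, real target) pairs are mixed with the genuine
# parallel data when training the forward source->target system.
print(synthetic_pairs)
```
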
  26. @_dmh · 8 minutes ago

    Koehn: So far we've been talking about changing data, which is nice, but we can also change how we do ML. For example, adding noise is now a standard practice (e.g. dropout and label smoothing).
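
Label smoothing is one simple way of "adding noise" to the training objective: instead of a one-hot target, the loss is computed against a distribution that reserves a small amount of probability mass for every other vocabulary item. A minimal plain-Python sketch, with an invented 4-word vocabulary and epsilon = 0.1:

```python
# Label smoothing in miniature: cross-entropy against a softened target
# distribution rather than a one-hot vector.
import math

def smoothed_targets(vocab_size: int, gold_index: int, epsilon: float = 0.1):
    off = epsilon / (vocab_size - 1)
    return [1.0 - epsilon if i == gold_index else off for i in range(vocab_size)]

def cross_entropy(target_dist, predicted_probs):
    return -sum(t * math.log(p) for t, p in zip(target_dist, predicted_probs) if t > 0)

predicted = [0.70, 0.10, 0.10, 0.10]        # model's softmax over a 4-word vocab
one_hot   = [1.0, 0.0, 0.0, 0.0]
smoothed  = smoothed_targets(vocab_size=4, gold_index=0, epsilon=0.1)

print(cross_entropy(one_hot, predicted))    # standard loss
print(cross_entropy(smoothed, predicted))   # slightly higher: the objective now
                                            # penalizes putting all mass on one word
```
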
  27. @_dmh · 7 minutes ago

    Koehn: So what's the outlook? Open questions: Why are neural models overfitting? Can we detect hallucinations? Can we counteract this behavior, with diversity in the training data or with better machine learning?
  28. @_dmh · 5 minutes ago

    Q&A discussion: What are your thoughts on pipelines? Can this help with some of these generalization issues? These days the trend is in the other direction: folks even want to do audio-to-audio MT. So far the syntax-/phrase-based approaches we used to use have not helped with NMT.
  29. @_dmh · 3 minutes ago

    Q&A discussion: For detecting hallucinations in target sentences, have you thought about quality in the input? Usually we have to translate what we have to translate.
  30. @_dmh · 1 minute ago

    Q&A discussion: One difference between NLG and MT is that we're starting w/structured meanings, as opposed to text-to-text where the meaning may be more ambiguous. The parallel would be MT w/back-translation similarity to the source text. Some folks have looked at this...
  31. @_dmh · 1 minute ago

    ...but usually after realization is complete rather than online, recognizing where things go wrong as they go wrong.
  32. @_dmh · 12 seconds ago

    Q&A discussion: I'm interested in the saliency aspect, which has been used a bit in image description/captioning. To what extent can you learn more from manipulating the source texts? Images can allow much more subtle manipulations than text.
