Duolingo English Test Speaking — Speak About the Topic for 130

ZeroPitch Research Desk

Updated 2026-05-16 · Newly added · be among the first to take it

Field: English Tests
Company: Duolingo English Test (DET)
Role: Duolingo English Test Speaking Candidate
Duration: 20 min
Difficulty: Hard
Completions: New
Updated: 2026-05-16

What this round is about

Topic focus. A timed Speak About the Photo and topic-talk drill: 20 seconds to plan, then up to 90 seconds to describe an image or talk about a question, calibrated to the Duolingo English Test 130 bar.
Conversation dynamic. A former DET speaking rater gives you a prompt, lets you speak for the full window, then debriefs and pushes on the single weakest part of your answer before the next prompt.
What gets tested. Task relevance to the actual image, full use of the speaking time, grammatical range, lexical range, and clear steady pronunciation.
Round format. Four speaking prompts that escalate from a calm scene to a people-and-action photo to a no-image topic talk, then a short reflection on your own delivery.

What strong answers look like

Overview first. A one-sentence summary of the whole scene in the first ten to fifteen seconds, for example: This photograph shows a busy street market on a sunny morning.
Colour and location detail. Every object paired with a colour and a position, for example: on the left there are wooden carts stacked with green vegetables and yellow bananas.
Actions in present continuous. People described as doing things now, for example: a woman in an orange sari is buying vegetables while shoppers are walking behind her.
Full-time coverage with a closing inference. The answer runs close to ninety seconds and ends with a speculation framed as likelihood, for example: overall the scene looks lively, which suggests it might be a weekend morning.

What weak answers look like (and how to avoid them)

Stopping early. Ending at thirty or forty seconds caps the band; keep adding real detail until the time is used.
Off-image content. Describing people or objects that are not in the picture breaks task relevance; describe only what is actually there.
Memorised intro. A generic opening that ignores the real image triggers the memorised-sounding penalty; adapt your overview to this specific scene.
Noun lists. Listing isolated objects with no verbs reads as low range; turn each object into a sentence with an action.

Pre-interview checklist (2 minutes before you start)

Recall the answer arc. Overview, foreground with colour and position, background, one inference, in that order.
Pull up your spatial words. In the foreground, in the background, on the left, on the right, in the centre, behind, in front of, next to.
Have your colour and action vocabulary ready. A bank of colours and present-continuous verbs so detail comes automatically under time pressure.
Identify your pause habit. Decide now that you will paraphrase rather than stop if a word does not come.
Think of your topic-talk shape. Position, two or three specific reasons, restate, for the no-image prompt.

How the AI behaves

Probes every answer. It debriefs each response and asks at least one follow-up before the next prompt, never accepting the first attempt as final.
No mid-interview praise. It will not say "great answer" or offer any other validation; it names what you actually did and pushes on the weak dimension.
Interrupts on drift and on short answers. It calls out the exact moment you went off-image, ran short, or repeated a sentence pattern.
Stays in character. It speaks as a former rater coaching to 130, never as a machine and never reading you the template before you attempt the answer.

Common traps in this type of round

Short answer. Finishing well under ninety seconds because you ran out of planned detail.
Template mismatch. Reciting a fixed script whose contents do not match the actual photo.
Noun dump. Naming objects without turning them into sentences with actions.
Rambling. Jumping between ideas with no main point and no connectors between them.
Long pauses. Silences over two seconds while searching for a word, which read as low fluency.
Rushing. Speaking so fast that pronunciation blurs and the answer becomes hard to follow.

Interview framework

You will be scored on these 6 dimensions. The full rubric with definitions is below.

Task Relevance To Prompt (22%). Whether everything you say is actually in the given image or on the given topic, with zero invented or off-image content.
Full-time Coverage (20%). Whether you use close to the full ninety seconds with continuous on-topic detail rather than stopping early or padding.
Descriptive Structure (20%). Whether you open with an overview, layer foreground then background detail, and close with an inference instead of listing at random.
Grammatical Range And Accuracy (18%). Whether you vary simple, compound and complex sentences with connectors rather than repeating one pattern.
Lexical And Spatial Precision (12%). Whether you pair objects with colour and position words and choose precise vocabulary over generic nouns.
Fluency Recovery Under Pressure. Whether you paraphrase and self-correct smoothly instead of pausing over two seconds or restarting when a word does not come.
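
To see how the listed weights could combine into a single figure, here is a minimal Python sketch. It is not the scorecard's actual formula, which this page does not give. The 0 to 10 rating scale, the dictionary keys and the `weighted_score` helper are all hypothetical. No weight is shown for Fluency Recovery Under Pressure, so the sketch assumes it takes the remaining 8% that brings the total to 100.

```python
# Weights as listed in the rubric above (percent).
LISTED_WEIGHTS = {
    "task_relevance": 22,
    "full_time_coverage": 20,
    "descriptive_structure": 20,
    "grammatical_range_accuracy": 18,
    "lexical_spatial_precision": 12,
}

# ASSUMPTION: the sixth weight is not shown on the page, so it is taken
# to be whatever remains after the five listed weights.
ASSUMED_FLUENCY_WEIGHT = 100 - sum(LISTED_WEIGHTS.values())

WEIGHTS = {**LISTED_WEIGHTS, "fluency_recovery": ASSUMED_FLUENCY_WEIGHT}


def weighted_score(ratings: dict) -> float:
    """Combine per-dimension ratings (hypothetical 0-10 scale) into one number."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return total / sum(WEIGHTS.values())


# Hypothetical ratings for an answer that was accurate but stopped early.
example = {
    "task_relevance": 9,
    "full_time_coverage": 5,
    "descriptive_structure": 8,
    "grammatical_range_accuracy": 7,
    "lexical_spatial_precision": 7,
    "fluency_recovery": 6,
}
print(weighted_score(example))
```

Under these assumptions, task relevance and full-time coverage together carry 42% of the result, so a short answer costs more than a slightly generic word choice.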

What we evaluate

Your final scorecard breaks down across these dimensions. The full rubric and tier criteria are revealed inside the interview itself.

Common questions

What does the DET Speak About the Photo round actually test?

It tests whether you can produce an organised, extended spoken description of an image under time pressure. After 20 seconds of planning you speak for up to 90 seconds. The Duolingo English Test scores it on task relevance, fluency and pace, grammatical range and accuracy, lexical range, and pronunciation. A 130 needs an answer that covers the whole image systematically, uses colour and location detail, runs close to the full time, and never describes something that is not in the picture.

How should I structure a 90-second Speak About the Photo answer?

Use a repeatable arc. In the first ten to fifteen seconds give a one-sentence overview of the whole scene. Then spend the bulk of the time on the foreground, naming people and objects with their colour and position and describing actions with present-continuous verbs. Move to the background and setting. Close with one inference about the situation or mood, framed as likelihood. The arc keeps you on-task and stops you running out of ideas at forty seconds.
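
One way to rehearse this arc is to turn it into timer cues. The Python sketch below is only an illustration: the 15-second overview follows the guidance above, but the 40, 20 and 15 second shares for foreground, background and inference are assumptions chosen to fill the 90 seconds, and `cue_points` is a hypothetical helper.

```python
TOTAL_SECONDS = 90

# Only the overview length comes from the guidance above.
# ASSUMPTION: the remaining split is illustrative, not an official breakdown.
ARC = [
    ("overview", 15),
    ("foreground", 40),
    ("background", 20),
    ("inference", 15),
]


def cue_points(arc, total=TOTAL_SECONDS):
    """Return (stage, start_second, end_second) for each stage of the answer."""
    if sum(seconds for _, seconds in arc) != total:
        raise ValueError("stage lengths must add up to the speaking window")
    cues = []
    elapsed = 0
    for stage, seconds in arc:
        cues.append((stage, elapsed, elapsed + seconds))
        elapsed += seconds
    return cues


for stage, start, end in cue_points(ARC):
    print(f"{start:>2}s to {end:>2}s: {stage}")
```

Practising against the cues shows whether you tend to spend too long on the overview or reach the inference with time still on the clock.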

What is a good DET score and why does 130 matter for Indian students?

The DET is scored from 10 to 160 in five-point bands. A 130 sits at the C1 threshold on the CEFR scale and is the practical competitive bar for top universities and many graduate programs. For Indian study-abroad applicants the test is attractive because it costs roughly 59 to 70 US dollars, is taken at home in about an hour, and returns results within 48 hours, far faster and cheaper than IELTS or TOEFL.

What are the most common mistakes that keep speaking below 130?

The most common mistakes are stopping after thirty or forty seconds instead of using the full ninety, reciting a memorised intro that does not match the actual image, listing isolated nouns with no full sentences, rambling with no main point, leaving pauses longer than two seconds, speaking so fast that pronunciation blurs, and describing people or objects that are not in the picture. Each of these caps the speaking subscore and pulls the overall below 130.

How is this practice different from the real automated DET scorer?

The real DET uses automated AI scoring with no human in the loop. This rehearsal puts a former speaking rater in front of you who reacts the way the scoring model behaves: rewarding full-time coverage, structure, range and clarity, and penalising short, off-image, or memorised answers. You get a spoken debrief and a transcript-backed scorecard instead of a single number, so you can hear exactly where an answer would have lost band.

How does the speaking subscore affect my overall and integrated scores?

Since July 2024 the DET reports eight subscores. Your overall is the average of the four individual subscores, including Speaking, so a weak speaking performance can hold the overall below 130 even if reading and writing are strong. Speaking also feeds two integrated subscores: Conversation, which combines Speaking and Listening, and Production, which combines Writing and Speaking. Many universities read these integrated subscores for program fit, so one short answer can drag three subscores down at once.
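
The arithmetic can be sketched in a few lines of Python. The overall score as the average of the four individual subscores follows the description above. Treating each integrated subscore as the plain mean of its two parts is an assumption, the `det_summary` helper and the sample scores are hypothetical, and rounding to five-point bands is not modelled.

```python
def det_summary(speaking, writing, reading, listening):
    """Return the overall score and the two speaking-linked integrated subscores."""
    # Overall: average of the four individual subscores.
    overall = (speaking + writing + reading + listening) / 4
    # ASSUMPTION: each integrated subscore is the plain mean of its two parts.
    conversation = (speaking + listening) / 2
    production = (writing + speaking) / 2
    return {
        "overall": overall,
        "conversation": conversation,
        "production": production,
    }


# Hypothetical candidate: strong everywhere except Speaking.
weak = det_summary(speaking=105, writing=135, reading=140, listening=135)
# Same candidate after lifting Speaking by 20 points.
improved = det_summary(speaking=125, writing=135, reading=140, listening=135)

print(weak["overall"])      # 128.75
print(improved["overall"])  # 133.75
```

In this made-up case the other three subscores are all at or above 135, yet the overall stays under 130 until Speaking improves, and Conversation and Production move with it.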

What should I do in the 20 seconds of planning time?

Do not write a script. Scan the image and pick three describable zones: the main subject, the foreground detail, and the background. Note one colour and one action you will mention in each zone, and one closing inference. That gives you a route through the full ninety seconds so you never freeze mid-answer. The goal of planning is a path, not a paragraph.

How do I keep talking for the full 90 seconds without padding?

Add detail, not filler. Every object gets a colour and a location, every person gets an action in present continuous, and the setting gets weather, time of day, and atmosphere. Then close with a speculation about who the people are or why the scene matters. Task relevance is scored, so describing more of the real image is safe while inventing extra content or repeating yourself is not.

What does a strong topic-talk answer sound like when there is no photo?

Open with a clear position or main point in one sentence, give two or three specific reasons or examples with concrete detail rather than abstractions, link them with connectors so the answer progresses rather than jumps, and close by restating your point. Use the full time, vary sentence structure between simple, compound and complex, and keep a steady pace so pronunciation stays clear.

How do I recover if I blank or use the wrong word mid-answer?

Do not stop and do not start over. Paraphrase around the missing word and keep moving, because a pause longer than two seconds signals to the scoring model that you are stuck and lowers fluency. If you say a wrong word, self-correct in one short phrase and continue. A candidate who repairs smoothly scores far better than one who freezes in search of the perfect word.
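
If you record your practice answers and have word-level timestamps from any transcription tool, you can check the two-second rule yourself. The Python sketch below is a self-check only, not how the DET scorer works. The `Word` tuple, the `long_pauses` helper and the sample timings are hypothetical; only the two-second limit comes from the guidance above.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


PAUSE_LIMIT_SECONDS = 2.0  # threshold taken from the guidance above


def long_pauses(words, limit=PAUSE_LIMIT_SECONDS):
    """Return (word_before, word_after, gap_seconds) for each silence over the limit."""
    flagged = []
    for previous, following in zip(words, words[1:]):
        gap = following.start - previous.end
        if gap > limit:
            flagged.append((previous.text, following.text, round(gap, 2)))
    return flagged


# Hypothetical timings: the speaker stalls after "left" while searching for a word.
transcript = [
    Word("on", 0.0, 0.3),
    Word("the", 0.3, 0.5),
    Word("left", 0.5, 0.9),
    Word("there", 3.4, 3.7),
    Word("are", 3.7, 3.9),
]
print(long_pauses(transcript))
```

Each flagged pair marks a spot where a paraphrase would have kept the answer moving.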

Why do memorised templates lower the score on the DET?

The scoring model checks that your response shows real language ability tied to the actual prompt. A generic opening such as "In this photo we can see various things", or a script that ignores what is really in the image, triggers the off-topic and memorised-sounding penalty. A flexible structure you adapt to each specific image is safe; a fixed paragraph you repeat regardless of the picture is not.