Wikidata:Suggester ranking input
We are currently looking into improving the search tool behind the search field (top right of each page) and the entity suggester (for example when you edit a statement). We are aware that some of the suggestions that appear, and their ranking in the list, are not always what you expect, and we would like to know more about this.
This page collects examples of current behavior of the suggester. Feel free to use the form below to add more, and to leave other kinds of feedback on the talk page. You can find archived requests on /Archive.
Note: this is not about Special:Search. To report an issue with the search features, please use this page.
General behaviour: items that fit value type property constraint should come first
- Property: Any property that uses value-type constraint (Q21510865). given name (P735) is a good example.
- What I type: e.g. "John"
- What I get: various things with "John" in their names, but not John (Q4925477).
- What I would expect: John (Q4925477), which fits the constraint, should come first.
- Details: Because given name (P735) has value-type constraint (Q21510865), items that fit into the class relation in the constraint should come first.
- Suggested by: Deryck Chan (talk) based on my earlier proposal above and User:billinghurst's suggestion on the talk page
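The expected behaviour above can be sketched as a stable re-ranking step over the suggester's existing results. This is only an illustration, not how the suggester is implemented: the `Candidate` type, the `rerank_by_value_type` function, and every ID except Q4925477 are hypothetical placeholders, and only direct instance of (P31) values are checked.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical suggester result; not a real Wikibase type."""
    item_id: str
    label: str
    instance_of: set = field(default_factory=set)  # direct P31 values only


def rerank_by_value_type(candidates, allowed_classes):
    """Move candidates whose direct P31 is in allowed_classes to the front.

    sorted() is stable, so the original relative order is preserved inside
    the matching group and inside the non-matching group.
    """
    allowed = set(allowed_classes)
    return sorted(candidates, key=lambda c: not (c.instance_of & allowed))


# Placeholder IDs and classes, except Q4925477 (the item named in the report).
results = [
    Candidate("Q_PLACEHOLDER_A", "John (some other item)", {"CLASS_OTHER"}),
    Candidate("Q_PLACEHOLDER_B", "John (another item)", {"CLASS_OTHER"}),
    Candidate("Q4925477", "John", {"CLASS_GIVEN_NAME"}),
]
reranked = rerank_by_value_type(results, {"CLASS_GIVEN_NAME"})
print([c.item_id for c in reranked])
```

Because the sort is stable, items that do not fit the constraint stay available in their original order; they are only moved below the ones that do.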
Ranking of painting (Q3305213) for P31
- Property: instance of (P31)
- What I type: "painting"
- What I get: art of painting (Q11629) in first position, and painting (Q3305213) in second position
- What I would expect: painting (Q3305213) in first position
- Details: In the WikiProject sum of all paintings we naturally create a lot of items about individual paintings... However, for me at least, the first suggestion for "painting" is confused between two meanings of the word. As it says in the EN.WP article for Painting: "In art, the term painting describes both the act and the result of the action". It is FAR more likely that I'll be trying to create a new 'instance of' a physical object (a painting) than an 'instance of' an action.
- Suggested by: Wittylama (talk) 08:16, 25 December 2017 (UTC)
Adding reference URL (P854) as a reference
- Property: reference URL (P854)
- What I type: Once I place the cursor in the property-input box in the "+ add reference" dialogue, before I type anything
- What I get: The system proposes 4 things to me. They are, in order: stated in (P248), retrieved (P813), PubMed publication ID (P698), and PMC publication ID (P932).
- What I would expect: I would expect reference URL (P854) at the top of these suggested first four options.
- What I type #2: "R" and "Re" [on my way to typing "reference URL"]
- What I get #2: retrieved (P813) in first position (I don't even think this is a valid property in this situation) and reference URL (P854) is only in second position.
- Details: Furthermore, after I have input and saved a URL as a reference and then click "+ add", the first automatically proposed property (before I start typing anything in the input box) is stated in (P248). At this point I would expect retrieved (P813) to be the first suggestion, given that it was the first suggestion earlier in this process. In fact, I would prefer that retrieved (P813) and 'today's date' be added AUTOMATICALLY, or at least that an "add today as the access date" option be offered as a one-click suggestion when someone adds a Ref URL.
- @Wittylama: there is gadget javascript currentDate that works for "Retrieved" — billinghurst sDrewth 23:17, 3 April 2018 (UTC)
- Suggested by: Wittylama (talk) 15:02, 27 December 2017 (UTC)
Height and Width
- Property: height (P2048) and width (P2049)
- What I type: Once I have added a number for the property for either "height" or "width" (usually in centimetres, when creating an item about a painting), a "Unit (optional)" dialogue appears. I start typing c-e-n-t.... looking to add centimetre (Q174728)
- What I get: With each new letter that I write of "centimetre" it progressively suggests: carbon (Q623), then Sri Lanka (Q854) [because of the alias "Ceylon"], then Cen (Q6812691), then cent (Q58093), then centi (Q108478), and then finally centimetre (Q174728).
- What I would expect: I would expect that this dialogue should prioritise Q items which are a "subclass of -> unit of measurement (Q47574)"
- Suggested by: Wittylama (talk) 16:50, 27 December 2017 (UTC)
- @Wittylama: I think the general behaviour we want here is that the input suggester for units should prioritise suggestions that fit the P2237 (P2237) stipulations for that property. Deryck Chan (talk) 18:23, 20 January 2018 (UTC)
- P2237 has been superseded by allowed units constraint (Q21514353). The rest of this suggestion remains the same. Deryck Chan (talk) 13:55, 7 February 2020 (UTC)
Search results for string "nap"
- Search field
- What I type: nap
- What I get: Neapolitan (Q33845) (alias nap), Naples International Airport (Q849383) (alias NAP), CTNNBL1 (Q18042346) (alias NAP), Naples (Q2634), Indianapolis (Q6346) (alias Naptown), SSC Napoli (Q2641), Napoleon (Q517)
- What I would expect: nap (Q901586) and nap (Q5242962) in top few results
- Details: I would expect an exact match on the entered string in my language's Label to appear at or near the top of the results list.
- Suggested by: PKM (talk) 20:20, 2 January 2018 (UTC)
- This seems to happen because other items have many more sitelinks and incoming links, thus dominating over nap (Q5242962). Usually the number of sitelinks and inlinks is a good criterion for a quality match, but not in this case. Smalyshev (WMF) (talk) 05:42, 9 January 2018 (UTC)
- @Smalyshev (WMF), PKM: This opens up a wider question: Are there ways to prioritise exact title matches in either the search results or the suggester? If the label of an item I want is an exact substring of a bunch of more popular items, there's currently no way of stopping the item I want from being drowned out. Deryck Chan (talk) 18:20, 20 January 2018 (UTC)
- @Deryck Chan: I would support prioritizing exact string matches in label and aliases in all cases. - PKM (talk) 19:53, 20 January 2018 (UTC)
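One way to picture the proposal in this thread is a two-part sort key: an exact-match tier first, popularity second. The sketch below is hypothetical and does not reflect how the search backend actually scores results; the function name, the dictionary fields, and the popularity numbers are invented for illustration, and only the item IDs and labels come from the report above.

```python
def rank_with_exact_match_boost(query, candidates):
    """Order candidates so exact label or alias matches precede everything else.

    Each candidate is a dict with 'id', 'label', 'aliases' and 'popularity'
    (a made-up stand-in for sitelink and incoming-link counts).
    """
    q = query.casefold()

    def sort_key(c):
        exact_label = c["label"].casefold() == q
        exact_alias = any(a.casefold() == q for a in c["aliases"])
        # Tier 0: exact label, tier 1: exact alias, tier 2: everything else.
        tier = 0 if exact_label else (1 if exact_alias else 2)
        # Within a tier, more popular items still come first.
        return (tier, -c["popularity"])

    return sorted(candidates, key=sort_key)


# Popularity values are invented; they only need to put "nap" items last.
candidates = [
    {"id": "Q33845", "label": "Neapolitan", "aliases": ["nap"], "popularity": 900},
    {"id": "Q2634", "label": "Naples", "aliases": [], "popularity": 1000},
    {"id": "Q901586", "label": "nap", "aliases": [], "popularity": 20},
    {"id": "Q5242962", "label": "nap", "aliases": [], "popularity": 10},
]
ranked = rank_with_exact_match_boost("nap", candidates)
print([c["id"] for c in ranked])
```

With this key, popularity still decides the order among items in the same tier, so it keeps working as a tie-breaker rather than being discarded.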
P31
- Property:
- What I type: click "add statement"
- What I get: nothing
- What I would expect: instance of (P31) or subclass of (P279) suggested.
- Details: any items on Category:P641 only reports
- Suggested by:
--- Jura 13:27, 5 March 2018 (UTC)
Pubmedification
- Property:
- What I type: add statement P31=film and P577=2017
- What I get: property for Pubmed identifier
- What I would expect: director, etc. other film related properties
- Details: It may be that the property suggester essentially assumes that any item in Wikidata is an item for an article imported from PubMed as soon as one or another of the properties used on those items is present.
- Suggested by:
--- Jura 15:35, 25 March 2018 (UTC)
Family Name
- Property: family name (P734)
- What I type: a family name
- What I get: lots of matches
- What I would expect: the match that is instance of (P31) family name (Q101352) to be top of the list
- Details: Example: adding family name (P734) = Somerville (Q16883559) to Annesley Somerville (Q4769027).
This really applies to any property with a property constraint (P2302) = value-type constraint (Q21510865), whether its value should belong to a class of items or to a particular list of items given in the constraint. Items that match the allowed values should appear at the top of the list. I don't expect the suggester to scan up the whole subclass of (P279) tree to see whether a value is in an allowed class. But it should be possible for it to identify direct instance of (P31) hits, and put them at the top of the list. (Addressing the closed-list case would also solve the gender:male issue.)
- Suggested by: Jheald (talk) 17:47, 6 April 2018 (UTC)
- Comment: Yes, this is really a duplicate of Deryck Chan's "General behaviour" points above. But it is a key issue with the suggester at the moment -- it is not offering contextual suggestions, only general suggestions, which are quite likely either a) inappropriate or b) considerably sub-optimal. Identifiably appropriate suggestions should come first, especially as the input string starts to become more specific. Jheald (talk) 17:55, 6 April 2018 (UTC)
Category items and P301
- Property:
- What I type:
- On an item for a category with P31 and 1 statement with category combines topics (P971), I add another P971 statement
- What I get:
- The suggestion for category's main topic (P301) disappears for additional statements.
- What I would expect:
- No impact on P301.
- Details:
- Somehow the second statement with P971 changes the ranking.
- Suggested by:
--- Jura 15:26, 1 June 2018 (UTC)
No suggestions on instances of template
- Property: instance of (P31) Wikimedia template (Q11266439)
- What I type: nothing
- What I get: no suggestion
- What I would expect: at least template has topic (P1423)
- Details:
- Suggested by:
--- Jura 09:28, 6 June 2018 (UTC)
Property constraint
- Property: property constraint
- What I type: single constraint
- What I get: no match was found
- What I would expect: single value constraint
- Details:
- Suggested by: Jonas Kress (WMDE) (talk) 13:27, 4 July 2018 (UTC)
- Completion search is a prefix search, so it can't find "single value constraint". Full search does find it. Smalyshev (WMF) (talk) 23:17, 5 July 2018 (UTC)
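The difference described in the reply above can be illustrated with two toy matchers. This is a simplified sketch under stated assumptions, not the actual completion or full-text search code: `prefix_match` and `all_tokens_match` are hypothetical helpers, and real search also involves analysis, scoring and aliases that are ignored here.

```python
def prefix_match(query, label):
    """Prefix-style matching: the label must start with the typed string."""
    return label.casefold().startswith(query.casefold())


def all_tokens_match(query, label):
    """Token-style matching: every typed word must occur somewhere in the label."""
    label_tokens = set(label.casefold().split())
    return all(tok in label_tokens for tok in query.casefold().split())


label = "single value constraint"

# "single constraint" is not a prefix of the label, so prefix matching misses it.
print(prefix_match("single constraint", label))
# Both typed words occur in the label, so token matching finds it.
print(all_tokens_match("single constraint", label))
# A true prefix is found by prefix matching.
print(prefix_match("single va", label))
```

The first call fails because the typed string skips the word "value"; a prefix matcher has no way to bridge that gap, while a token-based matcher does.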
General behavior: items that fit one-of constraint should come first
- Property: any property with constraint type one-of constraint (Q21510859)
- What I type: for refine date (P4241), type “third quar”
- What I get: third quarter (Q27927518): bottom left (dexter) quarter in quartering, in heraldry; third quarter (Q40719662): should be used with qualifier P4241 to narrow down the period it describes; 75th Hunger Games (Q14136446): fictional event
- What I would expect: third quarter (Q40719662): should be used with qualifier P4241 to narrow down the period it describes, in first position
- Details: I use first half, second half, first quarter, second quarter, etc. (of a century) all the time when entering inception dates for textile artworks like tapestries. The correct value is always suggested, but it isn’t reliably the first (or the second) item, so it really slows down entering data. When there is a “one-of” constraint (as there is for refine date (P4241)), either correct values should appear first, or only correct values should be suggested.
- Suggested by: PKM (talk) 01:43, 7 February 2020 (UTC)
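The two alternatives named in the details (allowed values first, or only allowed values) can be sketched as one small function with a switch. This is a hypothetical illustration, not existing suggester code: `apply_one_of` and its `only_allowed` parameter are invented names, and the allowed set below contains just the one value from this report rather than the full list in the constraint on refine date (P4241).

```python
def apply_one_of(candidate_ids, allowed_ids, only_allowed=False):
    """Apply a one-of constraint to an ordered list of suggested item IDs.

    only_allowed=False: allowed values are promoted, others follow in order.
    only_allowed=True:  values outside the allowed list are dropped.
    """
    allowed = [c for c in candidate_ids if c in allowed_ids]
    if only_allowed:
        return allowed
    rest = [c for c in candidate_ids if c not in allowed_ids]
    return allowed + rest


# Order as reported under "What I get" for the input "third quar".
suggested = ["Q27927518", "Q40719662", "Q14136446"]
# Assumption: only the value relevant to this example is listed here.
allowed_for_p4241 = {"Q40719662"}

print(apply_one_of(suggested, allowed_for_p4241))
print(apply_one_of(suggested, allowed_for_p4241, only_allowed=True))
```

Promoting keeps a way to enter a value that is not yet in the constraint list, while filtering gives the shortest list for routine data entry.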
QRank
Would QRank be useful for suggester ranking? (Disclaimer: I wrote it; apologies for the shameless plug). --Sascha (talk) 10:48, 19 March 2021 (UTC)
- Thanks for the suggestion! Please see a more detailed response here: Wikidata:Contact_the_development_team/Query_Service_and_search#QRank. DCausse (WMF) (talk) 07:50, 22 March 2021 (UTC)