Commit b43ba99

feature #36929 Added a FrenchInflector for the String component (Alexandre-T)

This PR was merged into the 5.2-dev branch.

Discussion
----------

Added a FrenchInflector for the String component

I read this sentence in [this blog post](https://symfony.com/blog/new-in-symfony-5-1-deprecated-the-inflector-component):

> Symfony Inflector component converts words between their singular and plural forms (**for now, only in English**)

So I created a `FrenchInflector` class implementing the `InflectorInterface` from the String component. This inflector uses regular expressions and is tested in `FrenchInflectorTest` against many of the French exceptions.

| Q             | A
| ------------- | ---
| Branch        | master
| Bug fix       | no
| New feature   | yes
| Deprecations  | no
| License       | MIT
| Doc PR        | Not yet

The changelog has been updated, but I'm not sure I put the entry in the right section.

I don't know whether I should update symfony/symfony-docs, but I have created an example and could open a PR with it if you want.

```php
<?php

use Symfony\Component\String\Inflector\FrenchInflector;

$inflector = new FrenchInflector();

$result = $inflector->singularize('dents'); // ['dent']
$result = $inflector->singularize('souris'); // ['souris']
$result = $inflector->singularize('messieurs'); // ['monsieur']

$result = $inflector->pluralize('cinquante'); // ['cinquante']
$result = $inflector->pluralize('pou'); // ['poux']
$result = $inflector->pluralize('cheval'); // ['chevaux']
```

**fabbot.io** reports a typo, but it is not one. The suggested patch replaces the French word 'embarras' with 'embarrass'. I will not remove or replace it, because "embarras" is an invariable word.

Commits
-------

d903d9a757 Added a FrenchInflector for the String component
French inflector implements InflectorInterface, uses regexp, and is tested in the FrenchInflectorTest
2 parents b791456 + 70351fd commit b43ba99

3 files changed: +309 -0 lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,11 @@
 CHANGELOG
 =========

+5.2.0
+-----
+
+ * added a `FrenchInflector` class
+
 5.1.0
 -----

Inflector/FrenchInflector.php

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace Symfony\Component\String\Inflector;
+
+/**
+ * French inflector.
+ *
+ * This class only inflects nouns; it does not handle adjectives or compound words like "soixante-dix".
+ */
+final class FrenchInflector implements InflectorInterface
+{
+    /**
+     * A list of all rules for pluralize. The first matching rule wins, so exceptions come before general rules.
+     * @see https://la-conjugaison.nouvelobs.com/regles/grammaire/le-pluriel-des-noms-121.php
+     */
+    private static $pluralizeRegexp = [
+        // First entry: regexp
+        // Second entry: replacement
+
+        // Words ending with "s", "x" or "z" are invariable
+        // Les mots finissant par "s", "x" ou "z" sont invariables
+        ['/(s|x|z)$/i', '\1'],
+
+        // Words ending with "eau" are pluralized with an "x"
+        // Les mots finissant par "eau" prennent tous un "x" au pluriel
+        ['/(eau)$/i', '\1x'],
+
+        // Words ending with "au" are pluralized with an "x", except "landau"
+        // Les mots finissant par "au" prennent un "x" au pluriel sauf "landau"
+        ['/^(landau)$/i', '\1s'],
+        ['/(au)$/i', '\1x'],
+
+        // Words ending with "eu" are pluralized with an "x", except "pneu", "bleu", "émeu"
+        // Les mots finissant en "eu" prennent un "x" au pluriel sauf "pneu", "bleu", "émeu"
+        ['/^(pneu|bleu|émeu)$/i', '\1s'],
+        ['/(eu)$/i', '\1x'],
+
+        // Words ending with "al" are pluralized with "aux", except the words in the next rule
+        // Les mots finissant en "al" se terminent en "aux" sauf
+        ['/^(bal|carnaval|caracal|chacal|choral|corral|étal|festival|récital|val)$/i', '\1s'],
+        ['/al$/i', 'aux'],
+
+        // Aspirail, bail, corail, émail, fermail, soupirail, travail, vantail and vitrail form their plural in "-aux"
+        ['/^(aspir|b|cor|ém|ferm|soupir|trav|vant|vitr)ail$/i', '\1aux'],
+
+        // Bijou, caillou, chou, genou, hibou, joujou and pou take an "x" in the plural
+        ['/^(bij|caill|ch|gen|hib|jouj|p)ou$/i', '\1oux'],
+
+        // Invariable words
+        ['/^(cinquante|soixante|mille)$/i', '\1'],
+
+        // French titles
+        ['/^(mon|ma)(sieur|dame|demoiselle|seigneur)$/', 'mes\2s'],
+        ['/^(Mon|Ma)(sieur|dame|demoiselle|seigneur)$/', 'Mes\2s'],
+    ];
+
+    /**
+     * A list of all rules for singularize.
+     */
+    private static $singularizeRegexp = [
+        // First entry: regexp
+        // Second entry: replacement
+
+        // Aspirail, bail, corail, émail, fermail, soupirail, travail, vantail and vitrail form their plural in "-aux"
+        ['/(aspir|b|cor|ém|ferm|soupir|trav|vant|vitr)aux$/i', '\1ail'],
+
+        // Words ending with "eau" are pluralized with an "x"
+        // Les mots finissant par "eau" prennent tous un "x" au pluriel
+        ['/(eau)x$/i', '\1'],
+
+        // Words ending with "al" are pluralized with "aux"; only the stems listed here are reversed
+        // Les mots finissant en "al" se terminent en "aux" sauf
+        ['/(amir|anim|arsen|boc|can|capit|capor|chev|crist|génér|hopit|hôpit|idé|journ|littor|loc|m|mét|minér|princip|radic|termin)aux$/i', '\1al'],
+
+        // Words ending with "au" are pluralized with an "x", except "landau"
+        // Les mots finissant par "au" prennent un "x" au pluriel sauf "landau"
+        ['/(au)x$/i', '\1'],
+
+        // Words ending with "eu" are pluralized with an "x", except "pneu", "bleu", "émeu"
+        // Les mots finissant en "eu" prennent un "x" au pluriel sauf "pneu", "bleu", "émeu"
+        ['/(eu)x$/i', '\1'],
+
+        // Words ending with "ou" are pluralized with an "s", except bijou, caillou, chou, genou, hibou, joujou, pou
+        // Les mots finissant par "ou" prennent un "s" sauf bijou, caillou, chou, genou, hibou, joujou, pou
+        ['/(bij|caill|ch|gen|hib|jouj|p)oux$/i', '\1ou'],
+
+        // French titles
+        ['/^mes(dame|demoiselle)s$/', 'ma\1'],
+        ['/^Mes(dame|demoiselle)s$/', 'Ma\1'],
+        ['/^mes(sieur|seigneur)s$/', 'mon\1'],
+        ['/^Mes(sieur|seigneur)s$/', 'Mon\1'],
+
+        // Default rule
+        ['/s$/i', ''],
+    ];
+
+    /**
+     * A list of words which should not be inflected.
+     * Only singularize needs it; pluralize already keeps words ending in "s" unchanged.
+     */
+    private static $uninflected = '/^(abcès|accès|abus|albatros|anchois|anglais|autobus|bois|brebis|carquois|cas|chas|colis|concours|corps|cours|cyprès|décès|devis|discours|dos|embarras|engrais|entrelacs|excès|fils|fois|gâchis|gars|glas|héros|intrus|jars|jus|kermès|lacis|legs|lilas|marais|mars|matelas|mépris|mets|mois|mors|obus|os|palais|paradis|parcours|pardessus|pays|plusieurs|poids|pois|pouls|printemps|processus|progrès|puits|pus|rabais|radis|recors|recours|refus|relais|remords|remous|rictus|rhinocéros|repas|rubis|sas|secours|sens|souris|succès|talus|tapis|tas|taudis|temps|tiers|univers|velours|verglas|vernis|virus)$/i';
+
+    /**
+     * {@inheritdoc}
+     */
+    public function singularize(string $plural): array
+    {
+        if ($this->isInflectedWord($plural)) {
+            return [$plural];
+        }
+
+        foreach (self::$singularizeRegexp as $rule) {
+            [$regexp, $replace] = $rule;
+
+            if (1 === preg_match($regexp, $plural)) {
+                return [preg_replace($regexp, $replace, $plural)];
+            }
+        }
+
+        return [$plural];
+    }
+
+    /**
+     * {@inheritdoc}
+     */
+    public function pluralize(string $singular): array
+    {
+        if ($this->isInflectedWord($singular)) {
+            return [$singular];
+        }
+
+        foreach (self::$pluralizeRegexp as $rule) {
+            [$regexp, $replace] = $rule;
+
+            if (1 === preg_match($regexp, $singular)) {
+                return [preg_replace($regexp, $replace, $singular)];
+            }
+        }
+
+        return [$singular.'s'];
+    }
+
+    private function isInflectedWord(string $word): bool
+    {
+        return 1 === preg_match(self::$uninflected, $word);
+    }
+}
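
Both methods above return on the first rule that matches, so each exception has to be listed before the general rule it overrides. The following standalone sketch illustrates that ordering with two of the pluralize rules. The function `applyFirstMatchingRule` and the two variable names are hypothetical and are not part of the commit.

```php
<?php

// Hypothetical helper (not in the commit): applies the first rule whose
// regexp matches, falling back to appending "s" like pluralize() does.
function applyFirstMatchingRule(array $rules, string $word): string
{
    foreach ($rules as [$regexp, $replacement]) {
        if (1 === preg_match($regexp, $word)) {
            return preg_replace($regexp, $replacement, $word);
        }
    }

    return $word.'s';
}

// Exception first, general rule second: the order used in the commit.
$exceptionFirst = [
    ['/^(landau)$/i', '\1s'],
    ['/(au)$/i', '\1x'],
];

// Same rules reversed: the general "au" rule now shadows the exception.
$generalFirst = array_reverse($exceptionFirst);

echo applyFirstMatchingRule($exceptionFirst, 'landau'), "\n"; // landaus
echo applyFirstMatchingRule($exceptionFirst, 'noyau'), "\n";  // noyaux
echo applyFirstMatchingRule($generalFirst, 'landau'), "\n";   // landaux
```

With the order reversed, "landau" is caught by the general "au" rule and gets the wrong plural, which is why new exceptions should be added above the rule they override.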

Tests/FrenchInflectorTest.php

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace Symfony\Component\String\Tests;
+
+use PHPUnit\Framework\TestCase;
+use Symfony\Component\String\Inflector\FrenchInflector;
+
+class FrenchInflectorTest extends TestCase
+{
+    public function pluralizeProvider()
+    {
+        return [
+            // Default plural
+            ['voiture', 'voitures'],
+            // Special characters
+            ['œuf', 'œufs'],
+            ['oeuf', 'oeufs'],
+
+            // Words ending with "s", "x" or "z" are invariable
+            ['bois', 'bois'],
+            ['fils', 'fils'],
+            ['héros', 'héros'],
+            ['nez', 'nez'],
+            ['rictus', 'rictus'],
+            ['souris', 'souris'],
+            ['tas', 'tas'],
+            ['toux', 'toux'],
+
+            // Words ending with "eau" all take an "x" in the plural
+            ['eau', 'eaux'],
+            ['sceau', 'sceaux'],
+
+            // Words ending with "au" all take an "x" in the plural, except "landau"
+            ['noyau', 'noyaux'],
+            ['landau', 'landaus'],
+
+            // Words ending with "eu" take an "x" in the plural, except "pneu", "bleu" and "émeu"
+            ['pneu', 'pneus'],
+            ['bleu', 'bleus'],
+            ['émeu', 'émeus'],
+            ['cheveu', 'cheveux'],
+
+            // Words ending with "al" end with "aux" in the plural
+            ['amiral', 'amiraux'],
+            ['animal', 'animaux'],
+            ['arsenal', 'arsenaux'],
+            ['bocal', 'bocaux'],
+            ['canal', 'canaux'],
+            ['capital', 'capitaux'],
+            ['caporal', 'caporaux'],
+            ['cheval', 'chevaux'],
+            ['cristal', 'cristaux'],
+            ['général', 'généraux'],
+            ['hopital', 'hopitaux'],
+            ['hôpital', 'hôpitaux'],
+            ['idéal', 'idéaux'],
+            ['journal', 'journaux'],
+            ['littoral', 'littoraux'],
+            ['local', 'locaux'],
+            ['mal', 'maux'],
+            ['métal', 'métaux'],
+            ['minéral', 'minéraux'],
+            ['principal', 'principaux'],
+            ['radical', 'radicaux'],
+            ['terminal', 'terminaux'],
+
+            // Except bal, carnaval, caracal, chacal, choral, corral, étal, festival, récital and val
+            ['bal', 'bals'],
+            ['carnaval', 'carnavals'],
+            ['caracal', 'caracals'],
+            ['chacal', 'chacals'],
+            ['choral', 'chorals'],
+            ['corral', 'corrals'],
+            ['étal', 'étals'],
+            ['festival', 'festivals'],
+            ['récital', 'récitals'],
+            ['val', 'vals'],
+
+            // Nouns ending with "-ail" take an "s" in the plural
+            ['portail', 'portails'],
+            ['rail', 'rails'],
+
+            // Except aspirail, bail, corail, émail, fermail, soupirail, travail, vantail and vitrail, which form their plural in "-aux"
+            ['aspirail', 'aspiraux'],
+            ['bail', 'baux'],
+            ['corail', 'coraux'],
+            ['émail', 'émaux'],
+            ['fermail', 'fermaux'],
+            ['soupirail', 'soupiraux'],
+            ['travail', 'travaux'],
+            ['vantail', 'vantaux'],
+            ['vitrail', 'vitraux'],
+
+            // Nouns ending with "-ou" take an "s" in the plural
+            ['trou', 'trous'],
+            ['fou', 'fous'],
+
+            // Except bijou, caillou, chou, genou, hibou, joujou and pou, which take an "x" in the plural
+            ['bijou', 'bijoux'],
+            ['caillou', 'cailloux'],
+            ['chou', 'choux'],
+            ['genou', 'genoux'],
+            ['hibou', 'hiboux'],
+            ['joujou', 'joujoux'],
+            ['pou', 'poux'],
+
+            // Invariable words
+            ['cinquante', 'cinquante'],
+            ['soixante', 'soixante'],
+            ['mille', 'mille'],
+
+            // Titles
+            ['monsieur', 'messieurs'],
+            ['madame', 'mesdames'],
+            ['mademoiselle', 'mesdemoiselles'],
+            ['monseigneur', 'messeigneurs'],
+        ];
+    }
+
+    /**
+     * @dataProvider pluralizeProvider
+     */
+    public function testSingularize(string $singular, string $plural)
+    {
+        $this->assertSame([$singular], (new FrenchInflector())->singularize($plural));
+        // test casing: if the first letter was uppercase, it should remain so
+        $this->assertSame([ucfirst($singular)], (new FrenchInflector())->singularize(ucfirst($plural)));
+    }
+
+    /**
+     * @dataProvider pluralizeProvider
+     */
+    public function testPluralize(string $singular, string $plural)
+    {
+        $this->assertSame([$plural], (new FrenchInflector())->pluralize($singular));
+        // test casing: if the first letter was uppercase, it should remain so
+        $this->assertSame([ucfirst($plural)], (new FrenchInflector())->pluralize(ucfirst($singular)));
+    }
+}
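
The casing assertions in both tests build their input with `ucfirst()`. As a side note, the sketch below shows what that call does for an ASCII word and for a word starting with an accented letter. It assumes the file is saved as UTF-8 and that no single-byte locale has been set with `setlocale()`; it is an illustration only, not part of the commit.

```php
<?php

// ASCII first letter: ucfirst() capitalises it.
$ascii = ucfirst('cheval');

// Accented first letter (assumes UTF-8 source): ucfirst() only looks at
// the first byte, so the two-byte "é" is left unchanged.
$accented = ucfirst('émeu');

echo $ascii, "\n";    // Cheval
echo $accented, "\n"; // émeu
```

Under those assumptions, the second assertion for entries such as "émeu" or "étal" repeats the lowercase check, so uppercase accented initials are not exercised by this test.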
